Journal articles on the topic 'Scene parsing'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Scene parsing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1. Shi, Hengcan, Hongliang Li, Fanman Meng, Qingbo Wu, Linfeng Xu, and King Ngi Ngan. "Hierarchical Parsing Net: Semantic Scene Parsing From Global Scene to Objects." IEEE Transactions on Multimedia 20, no. 10 (October 2018): 2670–82. http://dx.doi.org/10.1109/tmm.2018.2812600.

2. Liu, Ce, J. Yuen, and A. Torralba. "Nonparametric Scene Parsing via Label Transfer." IEEE Transactions on Pattern Analysis and Machine Intelligence 33, no. 12 (December 2011): 2368–82. http://dx.doi.org/10.1109/tpami.2011.131.

3. Zhang, Rui, Sheng Tang, Yongdong Zhang, Jintao Li, and Shuicheng Yan. "Perspective-Adaptive Convolutions for Scene Parsing." IEEE Transactions on Pattern Analysis and Machine Intelligence 42, no. 4 (April 1, 2020): 909–24. http://dx.doi.org/10.1109/tpami.2018.2890637.

4. Li, Xuelong, Lichao Mou, and Xiaoqiang Lu. "Scene Parsing From an MAP Perspective." IEEE Transactions on Cybernetics 45, no. 9 (September 2015): 1876–86. http://dx.doi.org/10.1109/tcyb.2014.2361489.

5. Zhang, Botao, Tao Hong, Rong Xiong, and Sergey A. Chepinskiy. "A terrain segmentation method based on pyramid scene parsing-mobile network for outdoor robots." International Journal of Advanced Robotic Systems 18, no. 5 (September 1, 2021): 17298814211048633. http://dx.doi.org/10.1177/17298814211048633.

Abstract:
Terrain segmentation is of great significance to robot navigation, cognition, and map building. However, existing vision-based methods struggle to meet both high-accuracy and real-time requirements. A terrain segmentation method based on a novel lightweight pyramid scene parsing mobile network is proposed for robot navigation. It combines the feature extraction structure of MobileNet with the encoding path of the pyramid scene parsing network. Depthwise separable convolution, spatial pyramid pooling, and feature fusion are employed to reduce the onboard computing time of the pyramid scene parsing mobile network. A unique dataset, the Hangzhou Dianzi University Terrain Dataset, is constructed for terrain segmentation; it contains more than 4000 images from 10 different scenes. The dataset was collected from a robot’s perspective to make it more suitable for robotic applications. Experimental results show that the proposed method achieves high accuracy and real-time performance on the onboard computer. Moreover, its real-time performance is better than that of most state-of-the-art methods for terrain segmentation.
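
For readers unfamiliar with the building blocks this abstract names, the sketch below shows a depthwise separable convolution and PSP-style spatial pyramid pooling in PyTorch. It is a minimal illustration; the channel counts and pooling bins are assumptions, not the authors' configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DepthwiseSeparableConv(nn.Module):
        # depthwise 3x3 followed by pointwise 1x1, as in MobileNet
        def __init__(self, cin, cout):
            super().__init__()
            self.depthwise = nn.Conv2d(cin, cin, 3, padding=1, groups=cin, bias=False)
            self.pointwise = nn.Conv2d(cin, cout, 1, bias=False)

        def forward(self, x):
            return F.relu(self.pointwise(F.relu(self.depthwise(x))))

    def pyramid_pooling(feat, bins=(1, 2, 3, 6)):
        # PSP-style pooling: pool at several grid sizes, upsample back,
        # and concatenate with the input feature map
        h, w = feat.shape[2:]
        pooled = [feat]
        for b in bins:
            p = F.adaptive_avg_pool2d(feat, b)
            pooled.append(F.interpolate(p, size=(h, w), mode='bilinear',
                                        align_corners=False))
        return torch.cat(pooled, dim=1)

    x = torch.randn(1, 32, 64, 64)
    y = pyramid_pooling(DepthwiseSeparableConv(32, 64)(x))
    print(y.shape)  # torch.Size([1, 320, 64, 64])
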
6. Chen, Xiaoyu, Chuan Wang, Jun Lu, Lianfa Bai, and Jing Han. "Road-Scene Parsing Based on Attentional Prototype-Matching." Sensors 22, no. 16 (August 17, 2022): 6159. http://dx.doi.org/10.3390/s22166159.

Abstract:
Road-scene parsing is complex and changeable; interference in the background destroys the visual structure of the image data, increasing the difficulty of target detection. The key to addressing road-scene parsing is to amplify the feature differences between targets, as well as those between targets and the background. This paper proposes a novel scene-parsing network, the Attentional Prototype-Matching Network (APMNet), which segments targets by matching candidate features with target prototypes regressed from labeled road-scene data. To obtain reliable target prototypes, we designed the Sample-Selection and Class-Repellence Algorithms for the prototype-regression process. We also built class-to-class and target-to-background attention mechanisms to increase feature recognizability based on the target’s visual characteristics and spatial-target distribution. Experiments conducted on two road-scene datasets, CamVid and Cityscapes, demonstrate that our approach effectively improves the representation of targets and achieves impressive results compared with other approaches.
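
As a rough illustration of the prototype matching described here, the following sketch computes per-class mean embeddings from labeled pixels and assigns candidates by cosine similarity. The sample-selection, class-repellence, and attention components of APMNet are omitted; all names are hypothetical.

    import torch
    import torch.nn.functional as F

    def class_prototypes(feats, labels, num_classes):
        # feats: (N, C) pixel embeddings; labels: (N,) class ids
        # a prototype is the mean embedding of each class
        protos = torch.zeros(num_classes, feats.shape[1])
        for c in range(num_classes):
            mask = labels == c
            if mask.any():
                protos[c] = feats[mask].mean(dim=0)
        return protos

    def match(feats, protos):
        # label each pixel with the prototype of highest cosine similarity
        sim = F.normalize(feats, dim=1) @ F.normalize(protos, dim=1).t()
        return sim.argmax(dim=1)

    feats = torch.randn(100, 16)
    labels = torch.randint(0, 4, (100,))
    print(match(feats, class_prototypes(feats, labels, 4)).shape)  # (100,)
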
7. Zhang, Pingping, Wei Liu, Yinjie Lei, Hongyu Wang, and Huchuan Lu. "Deep Multiphase Level Set for Scene Parsing." IEEE Transactions on Image Processing 29 (2020): 4556–67. http://dx.doi.org/10.1109/tip.2019.2957915.

8. Boutell, Matthew R., Jiebo Luo, and Christopher M. Brown. "Scene Parsing Using Region-Based Generative Models." IEEE Transactions on Multimedia 9, no. 1 (January 2007): 136–46. http://dx.doi.org/10.1109/tmm.2006.886372.

9. Hager, Gregory D., and Ben Wegbreit. "Scene parsing using a prior world model." International Journal of Robotics Research 30, no. 12 (June 3, 2011): 1477–507. http://dx.doi.org/10.1177/0278364911399340.

10. Bu, Shuhui, Pengcheng Han, Zhenbao Liu, and Junwei Han. "Scene parsing using inference Embedded Deep Networks." Pattern Recognition 59 (November 2016): 188–98. http://dx.doi.org/10.1016/j.patcog.2016.01.027.

11. Zhao, Hao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, and Li Zhang. "Pointly-supervised scene parsing with uncertainty mixture." Computer Vision and Image Understanding 200 (November 2020): 103040. http://dx.doi.org/10.1016/j.cviu.2020.103040.

12. Ou, Xin-Yu, Ping Li, He-Fei Ling, Si Liu, Tian-Jiang Wang, and Dan Li. "Objectness Region Enhancement Networks for Scene Parsing." Journal of Computer Science and Technology 32, no. 4 (July 2017): 683–700. http://dx.doi.org/10.1007/s11390-017-1751-x.

13. Bai, Haiwei, Jian Cheng, Yanzhou Su, Qi Wang, Haoran Han, and Yijie Zhang. "Multi-Branch Adaptive Hard Region Mining Network for Urban Scene Parsing of High-Resolution Remote-Sensing Images." Remote Sensing 14, no. 21 (November 2, 2022): 5527. http://dx.doi.org/10.3390/rs14215527.

Abstract:
Scene parsing of high-resolution remote-sensing images (HRRSIs) refers to parsing different semantic regions from the images, which is an important fundamental task in image understanding. However, due to the inherent complexity of urban scenes, HRRSIs contain numerous object classes. These objects present large-scale variation and irregular morphological structures. Furthermore, their spatial distribution is uneven and contains substantial spatial detail. All these features make it difficult to parse urban scenes accurately. To deal with these dilemmas, in this paper, we propose a multi-branch adaptive hard region mining network (MBANet) for urban scene parsing of HRRSIs. MBANet consists of three branches, namely, a multi-scale semantic branch, an adaptive hard region mining (AHRM) branch, and an edge branch. First, the multi-scale semantic branch is constructed based on a feature pyramid network (FPN). To reduce the memory footprint, ResNet50 is chosen as the backbone, which, combined with the atrous spatial pyramid pooling module, can extract rich multi-scale contextual information effectively, thereby enhancing object representation at various scales. Second, an AHRM branch is proposed to enhance the feature representation of hard regions with a complex distribution, which would otherwise be difficult to parse. Third, the edge-extraction branch is introduced to supervise boundary-perception training so that the contours of objects can be better captured. In our experiments, the three branches complemented each other in feature extraction and demonstrated state-of-the-art performance for urban scene parsing of HRRSIs. We also performed ablation studies on two HRRSI datasets from ISPRS and compared our method with other approaches.
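
The atrous spatial pyramid pooling module this abstract mentions can be sketched as parallel dilated convolutions whose outputs are concatenated; the dilation rates and channel counts below are common defaults, not necessarily MBANet's.

    import torch
    import torch.nn as nn

    class ASPP(nn.Module):
        # parallel atrous (dilated) 3x3 convolutions at several rates,
        # concatenated and projected to capture multi-scale context
        def __init__(self, cin, cout, rates=(1, 6, 12, 18)):
            super().__init__()
            self.branches = nn.ModuleList(
                [nn.Conv2d(cin, cout, 3, padding=r, dilation=r, bias=False)
                 for r in rates])
            self.project = nn.Conv2d(cout * len(rates), cout, 1)

        def forward(self, x):
            return self.project(torch.cat([b(x) for b in self.branches], dim=1))

    print(ASPP(256, 64)(torch.randn(1, 256, 32, 32)).shape)  # (1, 64, 32, 32)
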
14. Choe, Gyeongmin, Seong-Heum Kim, Sunghoon Im, Joon-Young Lee, Srinivasa G. Narasimhan, and In So Kweon. "RANUS: RGB and NIR Urban Scene Dataset for Deep Scene Parsing." IEEE Robotics and Automation Letters 3, no. 3 (July 2018): 1808–15. http://dx.doi.org/10.1109/lra.2018.2801390.

15. Mao, Aihua, Yuan Liang, Jianbo Jiao, Yongtuo Liu, and Shengfeng He. "Mask-Guided Deformation Adaptive Network for Human Parsing." ACM Transactions on Multimedia Computing, Communications, and Applications 18, no. 1 (January 31, 2022): 1–20. http://dx.doi.org/10.1145/3467889.

Abstract:
Due to the challenges of densely compacted body parts, nonrigid clothing items, and severe overlap in crowd scenes, human parsing needs to focus more on multilevel feature representations compared to general scene parsing tasks. Based on this observation, we propose to introduce the auxiliary task of human mask and edge detection to facilitate human parsing. Different from human parsing, which exploits the discriminative features of each category, human mask and edge detection emphasizes the boundaries of semantic parsing regions and the difference between foreground humans and background clutter, which benefits the parsing predictions of crowd scenes and small human parts. Specifically, we extract human mask and edge labels from the human parsing annotations and train a shared encoder with three independent decoders for the three mutually beneficial tasks. Furthermore, the decoder feature maps of the human mask prediction branch are further exploited as attention maps, indicating human regions to facilitate the decoding process of human parsing and human edge detection. In addition to these auxiliary tasks, we further alleviate the problem of deformed clothing items under various human poses by tracking the deformation patterns with the deformable convolution. Extensive experiments show that the proposed method can achieve superior performance against state-of-the-art methods on both single and multiple human parsing datasets. Code and trained models are available at https://github.com/ViktorLiang/MGDAN.
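
A minimal sketch of the mask-as-attention idea described above: the mask decoder's logits become a soft spatial gate over the parsing features. This is one plausible reading for illustration, not the paper's exact module.

    import torch

    def mask_guided_attention(parsing_feat, mask_logits):
        # turn the human-mask prediction into a soft attention map in [0, 1]
        # and use it to re-weight the parsing features (residual form)
        attn = torch.sigmoid(mask_logits)          # (N, 1, H, W)
        return parsing_feat * attn + parsing_feat  # emphasize human regions

    feat = torch.randn(2, 64, 32, 32)
    mask = torch.randn(2, 1, 32, 32)
    print(mask_guided_attention(feat, mask).shape)  # torch.Size([2, 64, 32, 32])
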
16. Bauer, Daniel. "Understanding Descriptions of Visual Scenes Using Graph Grammars." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 29, 2013): 1656–57. http://dx.doi.org/10.1609/aaai.v27i1.8498.

Abstract:
Automatic generation of 3D scenes from descriptions has applications in communication, education, and entertainment, but requires deep understanding of the input text. I propose thesis work on language understanding using graph-based meaning representations that can be decomposed into primitive spatial relations. The techniques used for analyzing text and transforming it into a scene representation are based on context-free graph grammars. The thesis develops methods for semantic parsing with graphs, acquisition of graph grammars, and satisfaction of spatial and world-knowledge constraints during parsing.

17. Li, Xiangtai, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, and Tao Xiang. "Global Aggregation Then Local Distribution for Scene Parsing." IEEE Transactions on Image Processing 30 (2021): 6829–42. http://dx.doi.org/10.1109/tip.2021.3099366.

18. Huang, Shaofei, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, and Shuicheng Yan. "ORDNet: Capturing Omni-Range Dependencies for Scene Parsing." IEEE Transactions on Image Processing 29 (2020): 8251–63. http://dx.doi.org/10.1109/tip.2020.3013142.

19. Alvarez, Jose M., Mathieu Salzmann, and Nick Barnes. "Exploiting Large Image Sets for Road Scene Parsing." IEEE Transactions on Intelligent Transportation Systems 17, no. 9 (September 2016): 2456–65. http://dx.doi.org/10.1109/tits.2016.2522506.

20. Razzaghi, Parvin, and Shadrokh Samavi. "A new fast approach to nonparametric scene parsing." Pattern Recognition Letters 42 (June 2014): 56–64. http://dx.doi.org/10.1016/j.patrec.2014.01.003.

21. Li, Teng, Xinyu Wu, Bingbing Ni, Ke Lu, and Shuicheng Yan. "Weakly-supervised scene parsing with multiple contextual cues." Information Sciences 323 (December 2015): 59–72. http://dx.doi.org/10.1016/j.ins.2015.06.024.

22. Flasiński, Mariusz. "Parsing of edNLC-graph grammars for scene analysis." Pattern Recognition 21, no. 6 (January 1988): 623–29. http://dx.doi.org/10.1016/0031-3203(88)90034-9.

23. Talebi, Mehdi, Abbas Vafaei, and S. Amirhassan Monadjemi. "Nonparametric scene parsing in the images of buildings." Computers & Electrical Engineering 70 (August 2018): 777–88. http://dx.doi.org/10.1016/j.compeleceng.2018.01.004.

24. Zou, Chuhang, Ruiqi Guo, Zhizhong Li, and Derek Hoiem. "Complete 3D Scene Parsing from an RGBD Image." International Journal of Computer Vision 127, no. 2 (November 21, 2018): 143–62. http://dx.doi.org/10.1007/s11263-018-1133-z.

25. Wang, Yinduo, Haofeng Zhang, Shidong Wang, Yang Long, and Longzhi Yang. "Semantic combined network for zero-shot scene parsing." IET Image Processing 14, no. 4 (March 27, 2020): 757–65. http://dx.doi.org/10.1049/iet-ipr.2019.0870.

26. Wang, ZeYu, YanXia Wu, ShuHui Bu, PengCheng Han, and GuoYin Zhang. "Structural inference embedded adversarial networks for scene parsing." PLOS ONE 13, no. 4 (April 12, 2018): e0195114. http://dx.doi.org/10.1371/journal.pone.0195114.

27. Niehorster, Diederick C., and Li Li. "Accuracy and Tuning of Flow Parsing for Visual Perception of Object Motion During Self-Motion." i-Perception 8, no. 3 (May 18, 2017): 2041669517708206. http://dx.doi.org/10.1177/2041669517708206.

Abstract:
How do we perceive object motion during self-motion using visual information alone? Previous studies have reported that the visual system can use optic flow to identify and globally subtract the retinal motion component resulting from self-motion to recover scene-relative object motion, a process called flow parsing. In this article, we developed a retinal motion nulling method to directly measure and quantify the magnitude of flow parsing (i.e., flow parsing gain) in various scenarios to examine the accuracy and tuning of flow parsing for the visual perception of object motion during self-motion. We found that flow parsing gains were below unity for all displays in all experiments; and that increasing self-motion and object motion speed did not alter flow parsing gain. We conclude that visual information alone is not sufficient for the accurate perception of scene-relative motion during self-motion. Although flow parsing performs global subtraction, its accuracy also depends on local motion information in the retinal vicinity of the moving object. Furthermore, the flow parsing gain was constant across common self-motion or object motion speeds. These results can be used to inform and validate computational models of flow parsing.
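
The flow parsing gain measured here can be read as a simple ratio between the retinal motion the visual system actually subtracts and the component attributable to self-motion; the numbers below are made up for illustration.

    # illustrative values only, not data from the paper
    self_motion_component = 5.0  # deg/s of retinal motion caused by self-motion
    nulling_velocity = 3.8       # deg/s of nulling motion at which the object appears stationary
    gain = nulling_velocity / self_motion_component
    print(gain)  # 0.76 -> below unity, i.e., incomplete subtraction
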
28. Tan, Xin, Ke Xu, Ying Cao, Yiheng Zhang, Lizhuang Ma, and Rynson W. H. Lau. "Night-Time Scene Parsing With a Large Real Dataset." IEEE Transactions on Image Processing 30 (2021): 9085–98. http://dx.doi.org/10.1109/tip.2021.3122004.

29. Bhowmick, Alexy, Sarat Saharia, and Shyamanta M. Hazarika. "Non-parametric scene parsing: Label transfer methods and datasets." Computer Vision and Image Understanding 219 (May 2022): 103418. http://dx.doi.org/10.1016/j.cviu.2022.103418.

30. Ai, Xinbo, Yunhao Xie, Yinan He, and Yi Zhou. "Improve SegNet with feature pyramid for road scene parsing." E3S Web of Conferences 260 (2021): 03012. http://dx.doi.org/10.1051/e3sconf/202126003012.

Abstract:
Road scene parsing is a common task in semantic segmentation. Its images are characterized by complex scene context and by large differences, across scales, among targets of the same category. To address these problems, we propose a semantic segmentation model combined with edge detection. We extend a segmentation network with an encoder-decoder structure by adding an edge feature pyramid module, namely the Edge Feature Pyramid Network (EFPNet, for short). This module uses edge detection operators to obtain boundary information and then combines multiscale features to improve the ability to recognize small targets. EFPNet can make up for the shortcomings of convolutional neural network features, and it helps to produce smooth segmentation. After extracting features of the encoder and decoder, EFPNet uses the Euclidean distance to compare the similarity between the representations of the encoder and the decoder, which can increase the decoder’s ability to recover information from the encoder. We evaluated the proposed method on the Cityscapes dataset. The experiments demonstrate that accuracy is improved by 7.5% and 6.2% over the popular SegNet and ENet, respectively, and an ablation experiment validates the effectiveness of our method.
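
To make the two mechanisms concrete, here is a hedged sketch of a Sobel edge operator of the kind the abstract refers to, plus a Euclidean encoder-decoder similarity term; EFPNet's actual operators and loss weighting are not specified here, so treat this as an assumption.

    import torch
    import torch.nn.functional as F

    def sobel_edges(gray):
        # classic edge-detection operator; gray: (N, 1, H, W)
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        gx = F.conv2d(gray, kx.reshape(1, 1, 3, 3), padding=1)
        gy = F.conv2d(gray, kx.t().reshape(1, 1, 3, 3), padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

    def encoder_decoder_distance(enc_feat, dec_feat):
        # per-pixel Euclidean distance between encoder and decoder features,
        # averaged so it can serve as an auxiliary similarity loss
        return torch.norm(enc_feat - dec_feat, p=2, dim=1).mean()

    img = torch.rand(1, 1, 64, 64)
    print(sobel_edges(img).shape)  # torch.Size([1, 1, 64, 64])
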
31. Wang, Zhifan, Tong Xin, Shidong Wang, and Haofeng Zhang. "Depth-embedded instance segmentation network for urban scene parsing." Journal of Intelligent & Fuzzy Systems 42, no. 3 (February 2, 2022): 1269–79. http://dx.doi.org/10.3233/jifs-202230.

Abstract:
The ubiquitous availability of cost-effective cameras has rendered large-scale collection of street-view data a straightforward endeavour. Yet, the effective use of these data to assist autonomous driving remains a challenge, especially the lack of exploration and exploitation of stereo images with abundant perceptible depth. In this paper, we propose a novel Depth-embedded Instance Segmentation Network (DISNet) which can effectively improve the performance of instance segmentation by incorporating the depth information of stereo images. The proposed network takes binocular images as input to observe the displacement of the object and estimate the corresponding depth perception without additional supervision. Furthermore, we introduce a new module for computing the depth cost-volume, which can be integrated with the colour cost-volume to jointly capture useful disparities of stereo images. The shared-weights structure of the Siamese Network is applied to learn the intrinsic information of stereo images while reducing the computational burden. Extensive experiments have been carried out on publicly available datasets (i.e., Cityscapes and KITTI), and the obtained results clearly demonstrate the superiority of our method in segmenting instances at different depths.
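
A disparity cost volume of the general kind the abstract refers to can be built by correlating left features with horizontally shifted right features; the correlation form and the max_disp value below are assumptions for illustration.

    import torch

    def cost_volume(left, right, max_disp=8):
        # for each candidate disparity d, compare left features with
        # right features shifted by d pixels
        n, c, h, w = left.shape
        vol = left.new_zeros(n, max_disp, h, w)
        for d in range(max_disp):
            if d == 0:
                vol[:, d] = (left * right).mean(dim=1)
            else:
                vol[:, d, :, d:] = (left[:, :, :, d:] * right[:, :, :, :-d]).mean(dim=1)
        return vol

    l = torch.randn(1, 32, 16, 32)
    r = torch.randn(1, 32, 16, 32)
    print(cost_volume(l, r).shape)  # torch.Size([1, 8, 16, 32])
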
32. Xiang, Zhenzhen, Anbo Bao, Jie Li, and Jianbo Su. "Boosting Real-Time Driving Scene Parsing With Shared Semantics." IEEE Robotics and Automation Letters 5, no. 2 (April 2020): 596–603. http://dx.doi.org/10.1109/lra.2020.2965075.

33. Jeong, C. Y., H. S. Yang, and K. D. Moon. "Horizon detection in maritime images using scene parsing network." Electronics Letters 54, no. 12 (June 2018): 760–62. http://dx.doi.org/10.1049/el.2018.0989.

34. Fu, Huan, Mingming Gong, Chaohui Wang, and Dacheng Tao. "MoE-SPNet: A mixture-of-experts scene parsing network." Pattern Recognition 84 (December 2018): 226–36. http://dx.doi.org/10.1016/j.patcog.2018.07.020.

35. Jin, Xin, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, and Zhibo Chen. "CASINet: Content-Adaptive Scale Interaction Networks for scene parsing." Neurocomputing 419 (January 2021): 9–22. http://dx.doi.org/10.1016/j.neucom.2020.08.014.

36. Yu, Tianshu, and Ruisheng Wang. "Scene parsing using graph matching on street-view data." Computer Vision and Image Understanding 145 (April 2016): 70–80. http://dx.doi.org/10.1016/j.cviu.2016.01.004.

37. Shi, Jian-feng, Ning Xiang, and A-chuan Wang. "High resolution scene parsing network based on semantic segmentation." Chinese Journal of Liquid Crystals and Displays 37, no. 12 (2022): 1598–606. http://dx.doi.org/10.37188/cjlcd.2022-0174.

38. Zhou, Wujie, Shaohua Dong, Caie Xu, and Yaguan Qian. "Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3571–79. http://dx.doi.org/10.1609/aaai.v36i3.20269.

Abstract:
RGB–thermal scene parsing has recently attracted increasing research interest in the field of computer vision. However, most existing methods fail to perform good boundary extraction for prediction maps and cannot fully use high-level features. In addition, these methods simply fuse the features from RGB and thermal modalities but are unable to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB–thermal scene parsing. First, we introduce a prior edge map generated using the RGB and thermal images to capture detailed information in the prediction map and then embed the prior edge information in the feature maps. To effectively fuse the RGB and thermal information, we propose a multimodal fusion module that guarantees adequate cross-modal fusion. Considering the importance of high-level semantic information, we propose a global information module and a semantic information module to extract rich semantic information from the high-level features. For decoding, we use simple elementwise addition for cascaded feature fusion. Finally, to improve the parsing accuracy, we apply multitask deep supervision to the semantic and boundary maps. Extensive experiments were performed on benchmark datasets to demonstrate the effectiveness of the proposed EGFNet and its superior performance compared with state-of-the-art methods. The code and results can be found at https://github.com/ShaohuaDong2021/EGFNet.
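
A toy sketch of two of the ideas named above, elementwise fusion of the modalities and embedding a prior edge map into the feature maps; EGFNet's real multimodal fusion module is more elaborate, so this is only a reading of the general mechanism.

    import torch

    def fuse_with_edge_prior(rgb_feat, thermal_feat, edge_prior):
        # elementwise fusion of the two modalities, then a multiplicative
        # edge prior (resized to feature resolution) to sharpen boundaries
        fused = rgb_feat + thermal_feat
        return fused * (1.0 + edge_prior)

    rgb = torch.randn(1, 64, 32, 32)
    thm = torch.randn(1, 64, 32, 32)
    edge = torch.rand(1, 1, 32, 32)  # prior edge map in [0, 1]
    print(fuse_with_edge_prior(rgb, thm, edge).shape)  # (1, 64, 32, 32)
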
39. Li, Yan-Li, Zhong Zhou, and Wei Wu. "Scene Parsing Based on A Two-Level Conditional Random Field." Chinese Journal of Computers 36, no. 9 (March 19, 2014): 1898–907. http://dx.doi.org/10.3724/sp.j.1016.2013.01898.

40. Pei, Yu, Bin Sun, and Shutao Li. "Multifeature Selective Fusion Network for Real-Time Driving Scene Parsing." IEEE Transactions on Instrumentation and Measurement 70 (2021): 1–12. http://dx.doi.org/10.1109/tim.2021.3070611.

41. Qian, Rui, Yunchao Wei, Honghui Shi, Jiachen Li, Jiaying Liu, and Thomas Huang. "Weakly Supervised Scene Parsing with Point-Based Distance Metric Learning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8843–50. http://dx.doi.org/10.1609/aaai.v33i01.33018843.

Abstract:
Semantic scene parsing suffers from the fact that pixel-level annotations are hard to collect. To tackle this issue, we propose Point-based Distance Metric Learning (PDML) in this paper. PDML does not require densely annotated masks and only leverages several labeled points, which are much easier to obtain, to guide the training process. Concretely, we leverage the semantic relationship among the annotated points by encouraging the feature representations of intra- and inter-category points to stay consistent, i.e., points within the same category should have more similar feature representations than those from different categories. We formulate this characteristic as a simple distance metric loss, which collaborates with the point-wise cross-entropy loss to optimize the deep neural networks. Furthermore, to fully exploit the limited annotations, distance metric learning is conducted across different training images instead of simply adopting an image-dependent manner. We conduct extensive experiments on two challenging scene parsing benchmarks, PASCAL-Context and ADE20K, to validate the effectiveness of our PDML, and competitive mIoU scores are achieved.
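
The distance metric loss described here can be sketched as a pull/push objective over the embeddings of labeled points: same-category points are pulled together, different-category points are pushed at least a margin apart. The margin form is a generic contrastive choice, not necessarily the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def point_metric_loss(emb, labels, margin=1.0):
        # emb: (P, C) embeddings of annotated points; labels: (P,)
        d = torch.cdist(emb, emb)                       # pairwise distances
        same = labels.unsqueeze(0) == labels.unsqueeze(1)
        eye = torch.eye(len(labels), dtype=torch.bool)
        pull = d[same & ~eye].pow(2).mean()             # intra-category
        push = F.relu(margin - d[~same]).pow(2).mean()  # inter-category
        return pull + push

    emb = torch.randn(20, 8)
    labels = torch.randint(0, 3, (20,))
    print(point_metric_loss(emb, labels))  # scalar loss tensor
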
42. Song, Qi, Kangfu Mei, and Rui Huang. "AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 3 (May 18, 2021): 2567–75. http://dx.doi.org/10.1609/aaai.v35i3.16359.

Abstract:
Two factors have proven to be very important to the performance of semantic segmentation models: global context and multi-level semantics. However, generating features that capture both factors always leads to high computational complexity, which is problematic in real-time scenarios. In this paper, we propose a new model, called the Attention-Augmented Network (AttaNet), to capture both global context and multi-level semantics while keeping the efficiency high. AttaNet consists of two primary modules: the Strip Attention Module (SAM) and the Attention Fusion Module (AFM). Observing that challenging images with low segmentation accuracy contain significantly more vertical strip areas than horizontal ones, SAM utilizes a striping operation to drastically reduce the complexity of encoding global context in the vertical direction while keeping most of the contextual information, compared to non-local approaches. Moreover, AFM follows a cross-level aggregation strategy to limit the computation, and adopts an attention strategy to weight the importance of different levels of features at each pixel when fusing them, obtaining an efficient multi-level representation. We have conducted extensive experiments on two semantic segmentation benchmarks, and our network achieves different levels of speed/accuracy trade-offs on Cityscapes, e.g., 71 FPS/79.9% mIoU, 130 FPS/78.5% mIoU, and 180 FPS/70.1% mIoU, as well as leading performance on ADE20K.
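
One plausible reading of the striping operation in SAM: pool the key/value features to a single row, so each pixel attends over W column summaries instead of all H*W positions, reducing the attention cost from O((HW)^2) to O(HW*W). The module below is an assumption-laden sketch, not the paper's implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StripAttention(nn.Module):
        def __init__(self, c):
            super().__init__()
            self.q = nn.Conv2d(c, c // 8, 1)
            self.k = nn.Conv2d(c, c // 8, 1)
            self.v = nn.Conv2d(c, c, 1)

        def forward(self, x):
            n, c, h, w = x.shape
            strip = F.adaptive_avg_pool2d(x, (1, w))      # (n, c, 1, w)
            q = self.q(x).flatten(2)                      # (n, c/8, h*w)
            k = self.k(strip).flatten(2)                  # (n, c/8, w)
            v = self.v(strip).flatten(2)                  # (n, c, w)
            attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # (n, h*w, w)
            out = (attn @ v.transpose(1, 2)).transpose(1, 2)     # (n, c, h*w)
            return x + out.reshape(n, c, h, w)

    print(StripAttention(64)(torch.randn(1, 64, 16, 16)).shape)
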
43. Shuai, Bing, Henghui Ding, Ting Liu, Gang Wang, and Xudong Jiang. "Toward Achieving Robust Low-Level and High-Level Scene Parsing." IEEE Transactions on Image Processing 28, no. 3 (March 2019): 1378–90. http://dx.doi.org/10.1109/tip.2018.2878975.

44. Tang, Ning, Haokui Xu, Jifan Zhou, Rende Shui, Mowei Shen, and Tao Gao. "A causal model of recursive scene parsing in human perception." Journal of Vision 18, no. 10 (September 1, 2018): 750. http://dx.doi.org/10.1167/18.10.750.

45. Shuai, Bing, Zhen Zuo, Gang Wang, and Bing Wang. "Scene Parsing With Integration of Parametric and Non-Parametric Models." IEEE Transactions on Image Processing 25, no. 5 (May 2016): 2379–91. http://dx.doi.org/10.1109/tip.2016.2533862.

46. Dai, Juting, and Xinyi Tang. "ResFusion: deeply fused scene parsing network for RGB-D images." IET Computer Vision 12, no. 8 (September 3, 2018): 1171–78. http://dx.doi.org/10.1049/iet-cvi.2018.5218.

47. Liu, Xiaobai, Yibiao Zhao, and Song-Chun Zhu. "Single-View 3D Scene Reconstruction and Parsing by Attribute Grammar." IEEE Transactions on Pattern Analysis and Machine Intelligence 40, no. 3 (March 1, 2018): 710–25. http://dx.doi.org/10.1109/tpami.2017.2689007.

48. Zhang, Ruimao, Liang Lin, Guangrun Wang, Meng Wang, and Wangmeng Zuo. "Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions." IEEE Transactions on Pattern Analysis and Machine Intelligence 41, no. 3 (March 1, 2019): 596–610. http://dx.doi.org/10.1109/tpami.2018.2799846.

49. Luo, Ao, Fan Yang, Xin Li, Rui Huang, and Hong Cheng. "EKENet: Efficient knowledge enhanced network for real-time scene parsing." Pattern Recognition 111 (March 2021): 107671. http://dx.doi.org/10.1016/j.patcog.2020.107671.

50. Büttner, Stefan, Zoltán-Csaba Márton, and Katharina Hertkorn. "Automatic scene parsing for generic object descriptions using shape primitives." Robotics and Autonomous Systems 76 (February 2016): 93–112. http://dx.doi.org/10.1016/j.robot.2015.11.003.
