Improving Deep Learning and Segmentation of Remote Sensing Images

By Taha Khan. Reviewed by Megan Craig, M.Sc. Nov 15 2022

In an article published in the journal Remote Sensing, researchers proposed a deep-separation-guided progressive reconstruction network that achieves accurate remote sensing image (RSI) segmentation.

Study: Deep-Separation Guided Progressive Reconstruction Network for Semantic Segmentation of Remote Sensing Images. Image Credit: Connect world/Shutterstock.com

Semantic Segmentation

The goal of semantic segmentation is to assign a class label to every pixel in an image. It is essential for several remote sensing applications, including urban planning, land cover classification, and scene understanding. Semantic segmentation of remote sensing images (RSIs) is increasingly adopting deep learning (DL) techniques, driven by the success of DL and its encouraging results on several semantic segmentation benchmarks built from natural images.
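
As a minimal illustration of per-pixel labeling, the sketch below assigns each pixel the class with the highest score. The score values and the helper name `label_pixels` are hypothetical and are not taken from the study.

```python
def label_pixels(scores):
    """Turn per-class score maps scores[c][y][x] into a label map labels[y][x]."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        # For each pixel, keep the index of the class with the largest score.
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical scores for two classes on a 2 x 3 image.
scores = [
    [[0.9, 0.2, 0.6], [0.1, 0.7, 0.4]],  # class 0
    [[0.1, 0.8, 0.4], [0.9, 0.3, 0.6]],  # class 1
]
labels = label_pixels(scores)
print(labels)
```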

However, an RSI is typically much larger than the natural images used in most computer vision applications, contains objects of widely varying sizes, and depicts complex scenes. Additionally, the oblique viewpoint during data acquisition can worsen multi-scale problems, because objects captured at different distances appear at different scales.

Convolutional neural networks have ushered in a new age of computer vision with the ongoing development of DL. Most networks use encoder-decoder topologies for tasks like semantic segmentation.

Encoder-Decoder Topologies

Typical encoders include transformers, ResNet, and VGG. These encoders can extract features, although their capacity is limited. The decoder's ability to reconstruct features is therefore critical for improving network performance. Different decoder architectures have been developed, although bottom-up approaches are usually applied after feature extraction. UNet, for example, uses a typical decoder of this kind and is the foundation for many later networks.

Previous Studies

Several studies have fused features of different scales after extraction to enhance feature reconstruction. Their decoders consist of standard convolutional and upsampling layers. Although easy to construct, this kind of decoder is ineffective at reconstructing features. Furthermore, the encoder's features differ in importance at different resolutions, and existing techniques do not exploit these differences. It is therefore essential that the model design rebuilds features effectively and lets features from different levels work together.

Digital Surface Model (DSM)

Previous studies have proposed various methods for semantic segmentation. For instance, one approach used a digital surface model (DSM) as auxiliary information to handle quality variations across multimodal RSI datasets and to improve the model's segmentation performance on single-modal data. Another study developed feature separation and aggregation models to combine multimodal information and thoroughly exploit the features of different stages.

Novel Network Architecture DGPRNet

In this study, the researchers propose a deep-separation-guided progressive reconstruction network (DGPRNet), built around a deep separation module (DSEM), for the semantic segmentation of RSIs. To enhance feature reconstruction, they designed a progressive reconstruction block (PRB) based on atrous spatial pyramid pooling (ASPP), in which multiple convolutional layers with different receptive fields reconstruct the features at each resolution.
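
To show how atrous (dilated) convolutions give parallel branches different receptive fields, here is a one-dimensional sketch. The kernel, the input signal, and the dilation rates 1, 2, and 4 are illustrative assumptions; the article does not report the rates used in the PRB.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input samples covered by a kernel whose taps are `dilation` apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with dilated kernel taps (no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation] for i, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]  # hypothetical 3-tap difference filter

# Parallel branches in the spirit of ASPP: same kernel, different dilation rates.
branches = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 4)}
for d, out in branches.items():
    print(d, effective_kernel_size(len(kernel), d), out)
```

With the same three weights, the branch with dilation 4 covers nine input samples, whereas the branch with dilation 1 covers three. This is why stacking such branches in parallel mixes context from several scales without adding weights per branch.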

Unlike approaches that rely on upsampling, the PRBs use deconvolution to change the resolution, increasing it at each block until the resolution of the input image is restored. Additionally, the DSEM processes semantic information so that pixels belonging to the same class are grouped together and the separation between pixels of different classes is maximized, which improves the guidance that deep semantic features provide to shallow layers.
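
The article does not give the DSEM's formulation, so the following is only a generic sketch of the idea of intra-class grouping and inter-class separation: it measures how close feature vectors lie to their own class centroid and how far apart the class centroids are. The function names and the toy features are hypothetical.

```python
import math


def class_centroids(features, labels):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def separation_scores(features, labels):
    """Return (mean intra-class distance, mean inter-class centroid distance)."""
    cents = class_centroids(features, labels)
    if len(cents) < 2:
        raise ValueError("at least two classes are needed to measure separation")
    intra = sum(math.dist(v, cents[l]) for v, l in zip(features, labels)) / len(features)
    labs = sorted(cents)
    pairs = [(a, b) for i, a in enumerate(labs) for b in labs[i + 1:]]
    inter = sum(math.dist(cents[a], cents[b]) for a, b in pairs) / len(pairs)
    return intra, inter


# Toy 2D pixel features for two classes.
features = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
labels = [0, 0, 1, 1]
intra, inter = separation_scores(features, labels)
print(intra, inter)
```

A module pursuing the goal described above would aim to make the first number small and the second large; how DGPRNet achieves this is specified in the paper, not here.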

How the Study was Conducted

Improving Feature Reconstruction

A PRB based on ASPP was included in the decoder to improve feature reconstruction and reduce reconstruction error. Each block produced its decoding output by processing five features in parallel using several convolution layers with different dilation rates and then increasing the feature resolution by deconvolution.
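
The resolution increase by deconvolution (transposed convolution) can be sketched in one dimension as follows. The kernel size of 2 and the stride of 2 are assumptions chosen so that the output is exactly twice as long as the input; the article does not state the settings used in the PRB.

```python
def transposed_conv_output_size(size, kernel_size, stride, padding=0):
    """Standard output length of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel_size


def transposed_conv1d(signal, kernel, stride):
    """1D transposed convolution without padding: each input sample
    scatters a scaled copy of the kernel into the output."""
    out = [0.0] * transposed_conv_output_size(len(signal), len(kernel), stride)
    for i, value in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i * stride + j] += value * w
    return out


# Hypothetical settings: a 2-tap kernel with stride 2 doubles the resolution.
upsampled = transposed_conv1d([1.0, 2.0, 3.0], [1.0, 1.0], stride=2)
print(upsampled)
```

Unlike fixed interpolation, the kernel weights of a transposed convolution are learned, which is the practical difference from plain upsampling.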

To highlight semantic information and exploit deep semantic features, the proposed DSEM processed the last three semantic features from the decoder. DGPRNet was trained with multi-supervision, which increased each module's capacity for reconstruction.
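
Multi-supervision is commonly implemented by adding a loss term for each supervised output. The sketch below assumes cross-entropy losses, equal weights, and outputs already resized to the target resolution; none of these details are confirmed by the article, and the function names are hypothetical.

```python
import math


def cross_entropy(probs, target):
    """Mean negative log-likelihood; probs[n][c] are class probabilities for pixel n."""
    return -sum(math.log(p[t]) for p, t in zip(probs, target)) / len(target)


def multi_supervised_loss(outputs, target, weights=None):
    """Weighted sum of the per-output losses (equal weights by default)."""
    weights = weights or [1.0] * len(outputs)
    return sum(w * cross_entropy(o, target) for w, o in zip(weights, outputs))


target = [0, 1]  # ground-truth classes of two pixels
outputs = [
    [[0.5, 0.5], [0.5, 0.5]],      # hypothetical prediction from an intermediate block
    [[0.25, 0.75], [0.25, 0.75]],  # hypothetical prediction from the final block
]
loss = multi_supervised_loss(outputs, target)
print(loss)
```

Because every supervised output contributes its own term, each block receives a direct training signal instead of relying only on gradients from the final prediction.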

Experimentation

Potsdam and Vaihingen RSI Datasets

Semantic segmentation experiments on the Potsdam and Vaihingen RSI datasets were conducted to assess the effectiveness of the proposed DGPRNet. The Potsdam dataset comprised 38 patches of 6000 x 6000 pixels, of which 17 were used for training and seven for testing. The Vaihingen dataset comprised 33 images of 2494 x 2064 pixels, of which 16 were used for training and five for testing.

The researchers used the mean intersection over union (IoU) and the mean per-class pixel accuracy as performance metrics. The IoU between the predicted and target regions was used to select the best segmentation weights, and it was the primary indicator for training and testing the different algorithms on the two RSI datasets.
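
Both metrics can be computed from a confusion matrix, as in the sketch below. It uses the standard definitions (IoU = TP / (TP + FP + FN), class accuracy = TP / (TP + FN)); the paper's exact averaging conventions may differ, and the toy predictions are hypothetical.

```python
def confusion_matrix(pred, target, num_classes):
    """matrix[t][p] counts pixels of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou_and_accuracy(matrix):
    """Mean IoU and mean per-class pixel accuracy, skipping classes with no pixels."""
    ious, accs = [], []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(row[c] for row in matrix) - tp
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
        if tp + fn > 0:
            accs.append(tp / (tp + fn))
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Flattened toy label maps for six pixels and two classes.
target = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 1, 0]
matrix = confusion_matrix(pred, target, num_classes=2)
miou, macc = mean_iou_and_accuracy(matrix)
print(matrix, miou, macc)
```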

Conclusion

To semantically segment remote sensing images, this study developed a novel network architecture called DGPRNet by exploring the relationships between and within classes of deep features and reducing feature reconstruction loss in the decoder. Before decoding, adjacent intermediate features were first used to supplement one another to strengthen the representation of multi-scale features.

Then, the PRB was designed and applied at five stages in the decoder to capture detailed features from different receptive fields at different resolutions, which reduced error and maintained accuracy throughout reconstruction. Finally, to use deep features in recognizing objects of various sizes, the proposed DSEM separated and aggregated interclass and intraclass features based on semantic features. According to experimental results on the two RSI datasets, DGPRNet outperformed 11 state-of-the-art methods, including the most recent semantic segmentation techniques.

Reference

Jiabao Ma, Wujie Zhou, Xiaohong Qian and Lu Yu (2022) Deep-Separation Guided Progressive Reconstruction Network for Semantic Segmentation of Remote Sensing Images. Remote Sensing. https://www.mdpi.com/2072-4292/14/21/5510
