*Important notice: This news item reports on an unedited version of a paper that has been accepted and is awaiting final editing. The study should therefore not be regarded as conclusive or treated as established information.
A terrestrial laser scanning dataset provides detailed 3D tree data for species classification. This optics-based resource supports AI-driven forestry, with the aim of improving biodiversity monitoring and the accuracy of ecological research.
Study: A Central European tree species dataset of annotated terrestrial laser scanning point clouds - TreeScanPL10K.
In a recent article published in the journal Scientific Data, researchers introduced TreeScanPL10K, a comprehensive dataset of over 10,000 annotated terrestrial laser scanning (TLS) point clouds representing Central European tree species to advance precision forestry and ecological research.
Forest Data Challenges
Modern forestry faces growing demands for accuracy in biodiversity assessment, carbon accounting, and sustainable management, driven by challenges such as climate change and biodiversity loss. Remote sensing technologies play an essential role in meeting these spatial data needs, with terrestrial laser scanning emerging as an indispensable tool that complements traditional satellite and airborne sensing.
TLS scans near-infrared laser light across a scene in 3D to acquire dense point clouds that record the precise geometry and return intensity of target surfaces. Unlike aerial sensors, which suffer from occlusion under dense canopies, TLS provides a "bottom-up" view that largely avoids this limitation, delivering detailed scans of individual trees, including stem shapes and crown architecture.
This optical precision forms the foundation for advanced analyses such as individual tree segmentation and species classification using deep learning. Yet, the development of these methods is bottlenecked by limited large-scale, well-annotated TLS datasets, which this study aims to address with TreeScanPL10K.
Data Collection & Processing
The dataset was acquired using TLS instrumentation, primarily the FARO Focus 3D X130 and Trimble TX5 phase-shift laser scanners. Both devices emit a laser beam that is reflected from surfaces and record a single return per beam, capturing three-dimensional coordinates alongside intensity values that represent the strength of the reflected beam.
Scanning was conducted at quarter resolution, corresponding to approximately 4 mm accuracy at a distance of 10 meters, balancing detail with efficiency. Each forest sample plot, covering a 500 m² circular area, was scanned from four positions: three arranged in a triangular layout and one at the plot center, to maximize coverage and reduce occlusion. This multi-angle data collection supports thorough capture of tree structure and crown details.
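For a sense of scale, the radius of a 500 m² circular plot follows directly from the area formula. The short Python sketch below computes it and checks whether a horizontal position falls inside the plot. The function names and the plot-centered coordinate system are illustrative assumptions, not part of the dataset.

```python
import math

PLOT_AREA_M2 = 500.0  # plot area stated in the article


def plot_radius(area_m2):
    """Radius of a circle with the given area: r = sqrt(A / pi)."""
    return math.sqrt(area_m2 / math.pi)


def inside_plot(x, y, center=(0.0, 0.0), area_m2=PLOT_AREA_M2):
    """True if the horizontal position (x, y) lies within the circular plot.

    Assumes coordinates in meters in a local, plot-centered system
    (a hypothetical convention chosen for this example).
    """
    distance = math.hypot(x - center[0], y - center[1])
    return distance <= plot_radius(area_m2)


radius = plot_radius(PLOT_AREA_M2)
print(f"Plot radius: {radius:.2f} m")
```

The radius comes out at roughly 12.6 m, so a scanner at the plot center is never much farther than that from any stem inside the plot.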
To align scans from multiple positions into a coherent point cloud, reference spheres were placed in the plots, enabling accurate fusion via geometric transformations. This process maintained millimeter-level accuracy and facilitated integration of intensity data in the final product.
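The idea behind target-based registration can be illustrated with a simplified Python sketch. It assumes that the scans are already levelled, so that only a rotation about the vertical axis (yaw) and a translation remain unknown, and that sphere centers have been matched between the two scans. This is not the authors' workflow: a full solution would estimate a general 3D rotation, and the function names and coordinates below are hypothetical.

```python
import math


def estimate_yaw_and_shift(src, dst):
    """Least-squares yaw and translation mapping src sphere centers onto dst.

    src, dst: lists of matched (x, y, z) sphere centers in two scans.
    Assumption: both scans are levelled, so rotation is about z only.
    """
    n = len(src)
    cs = [sum(p[i] for p in src) / n for i in range(3)]  # source centroid
    cd = [sum(p[i] for p in dst) / n for i in range(3)]  # target centroid
    s_dot = 0.0
    s_cross = 0.0
    for p, q in zip(src, dst):
        px, py = p[0] - cs[0], p[1] - cs[1]
        qx, qy = q[0] - cd[0], q[1] - cd[1]
        s_dot += px * qx + py * qy
        s_cross += px * qy - py * qx
    # Closed-form 2D Procrustes solution for the rotation angle.
    yaw = math.atan2(s_cross, s_dot)
    c, s = math.cos(yaw), math.sin(yaw)
    # Translation moves the rotated source centroid onto the target centroid.
    shift = (
        cd[0] - (c * cs[0] - s * cs[1]),
        cd[1] - (s * cs[0] + c * cs[1]),
        cd[2] - cs[2],
    )
    return yaw, shift


def apply_transform(points, yaw, shift):
    """Rotate points about the z axis by yaw, then translate by shift."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Hypothetical sphere centers seen from one scan position (meters).
spheres_scan_a = [(0.0, 0.0, 1.0), (5.0, 0.0, 1.2), (0.0, 4.0, 0.8), (-3.0, -2.0, 1.0)]
# The same spheres as seen in the reference scan, simulated with a known transform.
spheres_ref = apply_transform(spheres_scan_a, 0.3, (1.0, 2.0, 0.5))

yaw, shift = estimate_yaw_and_shift(spheres_scan_a, spheres_ref)
aligned = apply_transform(spheres_scan_a, yaw, shift)
```

Once the transform has been estimated from the spheres, the same `apply_transform` call can be applied to every point in the scan to bring it into the reference frame.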
Subsequent annotation involved detecting individual trees using algorithms that exploit both spatial coordinates and laser intensity return characteristics to delineate stems and crowns. Species labels were assigned by matching TLS-detected trees to field inventory data using spatial proximity and diameter measurements.
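The label-matching step can be sketched as a greedy nearest-neighbor assignment constrained by stem diameter. The Python example below is a minimal illustration under stated assumptions: the distance and diameter thresholds, the dictionary field names, and the greedy strategy are all hypothetical and are not taken from the paper.

```python
import math


def match_trees(tls_trees, field_trees, max_dist_m=1.0, max_dbh_diff_cm=5.0):
    """Assign field-inventory species to TLS-detected trees.

    A pair is a candidate if the stems are close in position and similar in
    diameter at breast height (DBH). Candidates are accepted greedily from the
    closest pair onward, and each tree is used at most once.
    Thresholds are illustrative assumptions.
    """
    candidates = []
    for ti, t in enumerate(tls_trees):
        for fi, f in enumerate(field_trees):
            dist = math.hypot(t["x"] - f["x"], t["y"] - f["y"])
            dbh_diff = abs(t["dbh_cm"] - f["dbh_cm"])
            if dist <= max_dist_m and dbh_diff <= max_dbh_diff_cm:
                candidates.append((dist, dbh_diff, ti, fi))
    candidates.sort()  # closest pairs first, ties broken by DBH difference

    labels = {}
    used_field = set()
    for _dist, _dbh_diff, ti, fi in candidates:
        if ti in labels or fi in used_field:
            continue
        labels[ti] = field_trees[fi]["species"]
        used_field.add(fi)
    return labels


# Hypothetical plot-local coordinates (meters) and diameters (centimeters).
tls_trees = [
    {"x": 0.0, "y": 0.0, "dbh_cm": 30.0},
    {"x": 5.0, "y": 5.0, "dbh_cm": 20.0},
    {"x": 10.0, "y": 0.0, "dbh_cm": 40.0},
]
field_trees = [
    {"x": 0.2, "y": 0.1, "dbh_cm": 31.0, "species": "Pinus sylvestris"},
    {"x": 5.1, "y": 4.8, "dbh_cm": 35.0, "species": "Fagus sylvatica"},
]

labels = match_trees(tls_trees, field_trees)
```

In this toy example only the first TLS tree receives a label: the second is close to a field record but differs too much in diameter, and the third has no field record nearby. Trees that cannot be matched simply remain without a species label.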
A dedicated attribute, “completelyInside,” was introduced to indicate whether a tree's crown was fully captured within the scan limits. The dataset also includes precomputed morphological metrics such as tree height (derived from the 99th percentile of normalized vertical coordinates) and crown projection area, both calculated from the point cloud data.
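Using a high percentile rather than the maximum makes the height estimate less sensitive to stray points above the crown. The Python sketch below illustrates the idea. It assumes that normalization means subtracting a single ground elevation per tree and uses linear interpolation between ranks; the paper's exact normalization and percentile convention may differ, and the sample values are invented.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted ranks."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def tree_height(z_values, ground_z, q=99.0):
    """Tree height as the q-th percentile of heights above ground.

    Assumption: a single ground elevation per tree is used for normalization.
    """
    normalized = [z - ground_z for z in z_values]
    return percentile(normalized, q)


# Hypothetical tree: returns every 0.25 m from the ground (102 m elevation)
# up to 25 m above it, plus one stray point far above the crown.
ground_z = 102.0
z_values = [ground_z + i * 0.25 for i in range(101)] + [ground_z + 38.0]

height = tree_height(z_values, ground_z)
highest_point = max(z_values) - ground_z
```

Here the highest raw point sits 38 m above the ground because of the stray return, while the 99th-percentile height stays just under 25 m, close to the true top of the simulated crown.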
Dataset Characteristics & Quality
TreeScanPL10K comprises 10,417 segmented trees, with species identified for approximately 72% of them through careful cross-referencing against ground truth. Optical scanning data proved robust across diverse forest stands in Poland, covering a range of species including conifers like Scots pine (Pinus sylvestris) and Norway spruce (Picea abies) as well as broadleaf trees such as European beech (Fagus sylvatica).
The intensity data from TLS enhanced differentiation of overlapping crowns and complex canopy structures, allowing manual annotation and AI training to better distinguish between species with similar spatial forms but distinct laser reflectance signatures.
Quality control revealed challenges inherent to TLS data, including ground-point contamination near tree bases and segmentation errors caused by crown overlaps. However, the fine spatial and intensity resolution allowed corrective manual edits to improve crown delineation and remove duplicate segments. Notably, conifers required slightly fewer corrections than broadleaves, likely because their simpler crown architecture produces more distinct laser return patterns.
Analyses demonstrated that the fusion of multi-angle TLS scans produced dense, complete point clouds with minimal occlusion, critical for capturing the intricate geometry and laser reflectance characteristics needed for precision forestry. The dataset’s comprehensive optical information facilitates applications beyond species classification, including biomass estimation and monitoring of structural changes in forest stands.
Implications & Applications
This research presents TreeScanPL10K, a richly annotated TLS dataset providing millimeter-accurate 3D and laser-intensity data for over 10,000 individual trees representing Central European forest species. The optical scanning technology underlying the dataset captures not only precise tree geometry but also reflectance intensity, enabling a level of species differentiation and biometric analysis that is difficult to achieve with traditional remote sensing methods.
The dataset addresses the critical need for high-quality labeled TLS data and offers the potential to refine ecological understanding, enhance biodiversity monitoring, and improve forest management through better exploitation of optical data. Future work leveraging this dataset can further integrate optical attributes and deep learning to transform sustainable forest management globally.
Journal Reference
Sterenczak K., Kulicki M., et al. (2026). A Central European tree species dataset of annotated terrestrial laser scanning point clouds - TreeScanPL10K. Scientific Data. DOI: 10.1038/s41597-026-07269-1, https://www.nature.com/articles/s41597-026-07269-1