A thread-safe N-dimensional tree library in modern Fortran. Provides two concrete tree types a balanced kd-tree (KdTree) and a balanced ball-tree (BallTree) both with radius nearest-neighbour search, id-based lookup, physical node removal, and DBSCAN clustering. The NdTree abstract base type holds the shared pool infrastructure and is the scaffold for future tree types (OcTree, QuadTree).
All node counts, indices, and capacities are integer(int64) end to end, so a tree can theoretically hold up to int64 nodes. The int64 kind is re-exported from the module, so use NdTreeFortran brings it into scope for writing literals such as bufferSize=1000_int64.
v0.7.1 introduces
BallTree(with selectable distance metric) and widens the entire public integer API tointeger(int64).
use NdTreeFortran
use iso_fortran_env, only: real64
type(KdTree) :: t
real(real64) :: pts(3, 4) = reshape([...], [3, 4]) ! 4 points in 3D
call t%build(pts)Or a ball-tree, optionally choosing the distance metric before building:
type(BallTree) :: b
call b%setMetric('manhattan') ! optional; defaults to 'euclidean'
call b%build(pts)Both types share the same search, getter, and mutation API documented below; the only difference is BallTree's metric is fixed at build time (see [BallTree](#balltree)).
A 128-bit composite identifier returned by getNodeId() and getAllNodeIds(). node_id is stable for the lifetime of the node. pool_idx is a mutable hint to the node's current pool position; it may go stale after addNodes or rmvNodes.
type, bind(c) :: NodeId
integer(c_int64_t) :: node_id = 0 ! stable, globally unique per tree
integer(c_int64_t) :: pool_idx = 0 ! positional hint; may be stale
end type NodeIdCompare by node_id only:
if (a%node_id == b%node_id) ...A single tree node. Not directly constructable by the caller. Accessed via NdNodePtr%p after a search. Exposes:
getCoords()real(real64)(:)coordinatesgetData()polymorphic payload (if set at build time)getNodeId()type(NodeId)
Owned pointer to a deep-copied NdNode. Freed automatically when it goes out of scope (via final). Call %destroy() explicitly if you need early release.
type(NdNodePtr), allocatable :: res(:)
res = t%rNN_Centroid([0.0_real64, 0.0_real64], 1.0_real64)
! res(1)%p points to a heap-allocated copy
call res(1)%destroy() ! optional; finalizer runs on deallocation anywayContainer for the NdNodePtr array returned by multi-query searches and DBSCAN. %nodes is a zero-length array when the query has no match.
type :: NdNodeBucket
type(NdNodePtr), allocatable :: nodes(:)
end type NdNodeBucketThe shared pool infrastructure, all search functions, and all getters are defined on this abstract type. KdTree and BallTree both extend NdTree. Users declare type(KdTree) or type(BallTree) NdTree is the scaffold for future tree types.
The concrete kd-tree type. Extends NdTree and implements the balanced kd-tree build, rNN search, addNodes, and rmvNodes. Also exposes:
getSplitAxis(node)integer(int64), the split dimension of a given nodeprintTree([unit])prints the tree in pre-order; see format below
The concrete ball-tree type. Extends NdTree and implements a balanced ball-tree build, rNN search, addNodes, and rmvNodes. Unlike KdTree, a BallTree carries a single distance metric chosen at build time and used for both construction and every search. Exposes:
setMetric(metric)sets the metric ('euclidean','manhattan','chebyshev'); must be called beforebuildanderror stops if the tree is already initialized. Defaults to'euclidean'if never called.getMetric()character(len=9), the tree's metric, or'not init'if not builtgetBallRadius(node)real(real64), the bounding-ball radius of a node (leaves are0);error stops on an uninitialized treeprintTree([unit])prints the tree in pre-order with[r=<radius>]prefixes; see format below
Because the metric is fixed per tree, passing a metric argument to a BallTree search that disagrees with the tree's metric is an error.
Builds the tree from coordsList, a real(real64) array of shape [dim, N]. Rejects double-build without an intervening destroy. For a BallTree, call setMetric first if you want a non-default metric.
real(real64) :: pts(2, 5) = reshape([...], [2, 5])
call t%build(pts)
! With polymorphic payload
integer :: labels(5) = [10, 20, 30, 40, 50]
call t%build(pts, labels)
! With custom rebuild ratio
call t%build(pts, rebuildRatio=0.5_real64)| Parameter | Type | Default | Notes |
|---|---|---|---|
coordsList |
real(real64)(:,:) |
n/a | shape [dim, N] |
dataList |
class(*)(:) |
absent | polymorphic; length must equal N or 0 |
rebuildRatio |
real(real64) |
0.25 |
rebuild threshold; must be in (0, 1) |
Frees all internal state. The tree can be rebuilt after destruction.
Appends N nodes to an initialized tree. When modifications + N > rebuildRatio * pop, a full rebalance runs; otherwise nodes are inserted at leaves.
real(real64) :: extra(2, 3) = reshape([...], [2, 3])
call t%addNodes(extra)Physically removes nodes and rebuilds the tree. Returns the number of nodes removed.
Five dispatch branches:
| Arguments | Semantics |
|---|---|
coordsList only |
removes all within epsilon of each query point |
ids only |
removes by NodeId; O(n x size(ids)) |
coordsList + ids |
paired: removes node at coords(:,i) matching ids(i) |
coordsList + radii |
removes all within radii(i) of coords(:,i) |
coordsList + radii + ids |
spatial search then filters to nodes whose id is in ids |
type(NodeId) :: target(1)
target(1) = someNode%p%getNodeId()
n = t%rmvNodes(ids=target)
! By coordinate
real(real64) :: q(2, 1) = reshape([1.0_real64, 2.0_real64], [2, 1])
n = t%rmvNodes(coordsList=q)
! By radius
real(real64) :: qs(2, 2) = reshape([0.0_real64, 0.0_real64, 5.0_real64, 0.0_real64], [2, 2])
real(real64) :: rs(2) = [0.5_real64, 1.0_real64]
n = t%rmvNodes(coordsList=qs, radii=rs)| Parameter | Type | Default | Notes |
|---|---|---|---|
coordsList |
real(real64)(:,:) |
absent | required unless ids only |
radii |
real(real64)(:) |
absent | paired with coordsList; length must match |
ids |
type(NodeId)(:) |
absent | unordered set; obtain via getNodeId() |
epsilon |
real(real64) |
1e-15 |
coord-match tolerance when radii is absent |
metric |
character(*) |
'euclidean' |
valid metrics: 'euclidean', 'manhattan', 'chebyshev' |
bufferSize |
integer(int64) |
1000 |
initial rNN buffer capacity; must be > 0 |
All search functions are read-only and safe to call concurrently from multiple threads.
Finds all nodes within radius of a single coordinate point. The query point need not exist in the tree.
type(NdNodePtr), allocatable :: res(:)
res = t%rNN_Centroid([0.0_real64, 0.0_real64], 1.5_real64)
res = t%rNN_Centroid([0.0_real64, 0.0_real64], 1.5_real64, metric='manhattan')Finds all nodes within radius of an existing tree node. Pass excludeTarget=.true. to omit the query node itself.
type(NdNodePtr), allocatable :: centre(:), neighbours(:)
centre = t%rNN_Centroid([0.0_real64, 0.0_real64], 0.01_real64)
neighbours = t%rNN_Node(centre(1), 1.5_real64, excludeTarget=.true.)Batch coordinate search. coords is [dim, nQuery]; returns a parallel array of NdNodeBucket.
real(real64) :: q(2, 3) = reshape([...], [2, 3])
type(NdNodeBucket), allocatable :: res(:)
res = t%rNN_Coords(q, epsilon=0.5_real64)
! res(i)%nodes contains all matches for query i| Parameter | Type | Default | Notes |
|---|---|---|---|
coords |
real(real64)(:,:) |
n/a | shape [dim, nQuery] |
metric |
character(*) |
'euclidean' |
'euclidean', 'manhattan', 'chebyshev' |
epsilon |
real(real64) |
1e-15 |
search radius; must be >= 0 |
bufferSize |
integer(int64) |
1000 |
initial buffer; must be > 0 |
Like rNN_Coords, but ids(i) is an additional filter for query i. Returns a node only if it is within epsilon of coords(:,i) and its node_id equals ids(i)%node_id. ids is paired with coords column-by-column.
type(NodeId) :: target_ids(2)
target_ids(1) = nodeA%p%getNodeId()
target_ids(2) = nodeB%p%getNodeId()
res = t%rNN_Ids(q, target_ids, epsilon=1e-10_real64)Per-query variable-radius search. radii(i) is the search radius for coords(:,i).
real(real64) :: q(2, 2) = reshape([0.0_real64, 0.0_real64, 5.0_real64, 5.0_real64], [2, 2])
real(real64) :: r(2) = [1.5_real64, 3.0_real64]
res = t%rNN_Rad(q, r)Per-query variable-radius search, then filters to nodes whose node_id appears anywhere in ids. ids is an unordered set, not paired with coords.
type(NodeId) :: wanted(2)
wanted(1) = nodeA%p%getNodeId()
wanted(2) = nodeB%p%getNodeId()
res = t%rNN_RadIds(q, r, wanted)
! res(i)%nodes: nodes within r(i) of q(:,i) that appear in wantedLooks up nodes by NodeId. Uses pool_idx as an O(1) hint; falls back to O(n) scan only when the hint is stale (e.g. after rmvNodes). Returns a zero-length array when no match exists.
type(NodeId) :: ids(2)
type(NdNodePtr), allocatable :: res(:)
ids(1) = nodeA%p%getNodeId()
ids(2) = nodeB%p%getNodeId()
res = t%linScan(ids)Prefer coordinate-based search when you know where a node is. Use linScan when you have NodeId values but no coordinates.
Density-based spatial clustering. Returns res(1:nClusters+1): each res(i) for i <= nClusters holds one cluster, and res(nClusters+1) holds all noise nodes. Returns a zero-length array when the tree is empty.
type(NdNodeBucket), allocatable :: res(:)
integer :: nClusters, noiseCount, i
res = t%DBSCAN(minPts=3_int64, radius=0.5_real64)
nClusters = size(res) - 1
noiseCount = size(res(size(res))%nodes)
do i = 1, nClusters
print *, 'cluster', i, 'has', size(res(i)%nodes), 'nodes'
end do
print *, 'noise nodes:', noiseCount| Parameter | Type | Default | Notes |
|---|---|---|---|
minPts |
integer(int64) |
n/a | minimum neighbourhood size for a core point |
radius |
real(real64) |
n/a | neighbourhood search radius (epsilon) |
metric |
character(*) |
'euclidean' |
'euclidean', 'manhattan', 'chebyshev' |
bufferSize |
integer(int64) |
1000 |
initial rNN buffer capacity; must be > 0 |
Error guards (error stop): uninitialized tree; minPts < 0; radius < 0; unknown metric; bufferSize <= 0.
Border points: a point first visited as noise that is later reached as a seed by a core point is correctly reassigned to the cluster.
Prints the tree in pre-order (root, left subtree, right subtree). Each line is indented two spaces per depth level. The per-node prefix is tree-specific: KdTree prints [axis=N] (the split dimension), BallTree prints [r=<radius>] (the bounding-ball radius).
! 3 points in 2D: (1,2), (3,1), (5,4)
call t%printTree()KdTree output:
[axis=1] (3.000, 1.000)
[axis=2] (1.000, 2.000)
[axis=2] (5.000, 4.000)
BallTree output (radii depend on geometry):
[r=2.828] (3.000, 1.000)
[r=0.000] (1.000, 2.000)
[r=0.000] (5.000, 4.000)
unit defaults to stdout. Pass any open Fortran unit to redirect output.
Compares printTree output against an expected node set, matching as an unordered multiset of coordinate tuples (the per-node prefix and indentation are stripped). Use in tests to verify tree shape independent of tie-breaking order.
call t%assert('my_test', [ character(len=32) :: &
'[axis=1] (3.000, 1.000)', &
'[axis=2] (1.000, 2.000)', &
'[axis=2] (5.000, 4.000)' &
])Prints a diagnostic and calls stop 1 on mismatch.
| Function | Returns | Notes |
|---|---|---|
getPop() |
integer(int64) |
number of live nodes |
getDim() |
integer(int64) |
coordinate dimension |
getInitState(isInit) |
via argument | .true. after build, .false. after destroy |
getTreeId() |
integer(int64) |
unique id per build call |
getNumMods() |
integer(int64) |
insertions since last rebuild; 0 after rebuild |
getRebuildRatio() |
real(real64) |
current rebuild threshold |
setRebuildRatio(ratio) |
ratio must be in (0, 1) |
|
associatedNodePool(assoc) |
via argument | .true. when pool is allocated |
associatedRoot(assoc) |
via argument | .true. when root index is nonzero |
| Function | Returns | Notes |
|---|---|---|
getAllNodes() |
NdNodePtr(:) |
deep-copied array, length == pop; pool_idx pre-stamped |
getAllCoords() |
real(real64)(:,:) |
shape [dim, pop]; column i is pool position i |
getAllNodeIds() |
type(NodeId)(:) |
length pop; pool_idx accurate at call time |
On an NdNode accessed via NdNodePtr%p:
| Method | Returns | Notes |
|---|---|---|
getNodeId() |
type(NodeId) |
stable node_id; pool_idx at search |
getCoords() |
real(real64)(:) |
copy of node coordinates |
getData() |
class(*) |
polymorphic payload; set at build time |
Returns the splitting dimension for a given node in the kd-tree. Called on the tree, not the node:
type(NdNodePtr), allocatable :: res(:)
integer(int64) :: axis
res = t%rNN_Centroid([0.0_real64, 0.0_real64], 1.0_real64)
axis = t%getSplitAxis(res(1)%p) ! 1-based dimension indexReturns .true. if target belongs to this tree instance and has not been removed. Uses pool_idx from the node's NodeId as an O(1) hint; falls back to a full O(n) pool scan when the hint is stale (e.g. after rmvNodes compacted the pool). Updates the hint if O(n) call was triggered.
type(NdNodePtr), allocatable :: res(:)
res = t%rNN_Centroid([0.0_real64, 0.0_real64], 1.0_real64)
print *, t%isMember(res(1)) ! .true.| Operation | Concurrent-safe? |
|---|---|
Any rNN_*, linScan, DBSCAN, getAllNodes, getAllCoords, getAllNodeIds |
Yes read-only, no locking |
addNodes from multiple threads on one tree |
Yes serialized via !$OMP CRITICAL (tree_mutate) |
rmvNodes from multiple threads on one tree |
Yes search is read-only; compaction and rebuild are serialized |
rmvNodes concurrent with addNodes |
Yes same critical region |
| Same node removed by N threads concurrently | Yes keepMask re-checked inside critical; only 1 thread removes |
build concurrent with addNodes/rmvNodes |
No build is not guarded |
destroy concurrent with anything |
No |
type(KdTree) :: t
type(NdNodePtr), allocatable :: allNodes(:), neighbours(:)
real(real64) :: r = 1.5_real64
integer :: i
call t%build(pts)
allNodes = t%getAllNodes()
do i = 1, size(allNodes)
neighbours = t%rNN_Node(allNodes(i), r, excludeTarget=.true.)
! neighbours(:)%p -- the adjacent nodes to allNodes(i)
end dotype(NdNodeBucket), allocatable :: res(:)
integer :: nClusters, i, j
res = t%DBSCAN(minPts=4_int64, radius=0.5_real64)
nClusters = size(res) - 1
do i = 1, nClusters
do j = 1, size(res(i)%nodes)
print *, res(i)%nodes(j)%p%getCoords()
end do
end do
! Noise bucket is always last
do j = 1, size(res(nClusters+1)%nodes)
print *, 'noise:', res(nClusters+1)%nodes(j)%p%getCoords()
end dotype(KdTree) :: t
type(NdNodePtr), allocatable :: found(:)
type(NodeId) :: id(1)
type(NdNodeBucket), allocatable :: res(:)
integer(int64) :: numRmv
call t%build(pts)
found = t%rNN_Centroid([0.0_real64, 0.0_real64], 0.01_real64)
id(1) = found(1)%p%getNodeId()
res = t%rNN_RadIds(q, r, id)
if (size(res(1)%nodes) > 0) then
numRmv = t%rmvNodes(ids=id)
end iftype(KdTree) :: t
type(NdNodeBucket), allocatable :: res(:)
call t%build(pts)
!$OMP PARALLEL DEFAULT(NONE) SHARED(t) NUM_THREADS(4) PRIVATE(res)
res = t%DBSCAN(minPts=3_int64, radius=0.5_real64)
! Each thread gets its own result copy; tree is read-only
!$OMP END PARALLELDBSCAN results do not depend on tree population when only noise is added or removed outside cluster boundaries:
integer :: n1, n2
res = t%DBSCAN(minPts=3_int64, radius=0.5_real64)
n1 = size(res) - 1 ! nClusters
call t%addNodes(distant_pts) ! points too far to join any cluster
res = t%DBSCAN(minPts=3_int64, radius=0.5_real64)
n2 = size(res) - 1
! n1 == n2; cluster membership is unchanged| Constant | Value | Type |
|---|---|---|
DEFAULT_BUFFER_SIZE |
1000 |
integer(int64) |
DEFAULT_METRIC |
'euclidean' |
character(len=9) |
DEFAULT_EPSILON |
1e-15 |
real(real64) |