Flat Segment Tree

Overview

Flat segment tree is a derivative of segment tree, and is designed to store non-overlapping 1-dimensional range values such that the values of the neighboring ranges are guaranteed to be different. An insertion of a range value into this structure will always overwrite one or more existing ranges that overlap with the new range. If an insertion of a new range would cause any adjacent ranges to have the equal value, those ranges will be merged into one range.

An instance of this structure is initialized with fixed lower and upper bounaries, which will not change throughout the life time of the instance.

The flat segment tree structure consists of two parts: the leaf-node part which also forms a doubly-linked list, and the non-leaf-node part which forms a balanced-binary tree and is used only when performing tree-based queries. The range values are stored in the leaf-nodes, while the non-leaf nodes are used only for queries.

Quick start

This section demonstrates a simple use case of storing non-overlapping ranged values and performing queries using flat_segment_tree.

First, we need to instantiate a concrete type from the template:

using fst_type = mdds::flat_segment_tree<long, int>;

then create an instance of this type:

// Define the begin and end points of the whole segment, and the default
// value.
fst_type db(0, 500, 0);

Here, the first and second arguments specify the lower and upper boundaries of the whole segment. The third argument specifies the value for the empty segments. What this line does is to create a new instance and initializes it with one initial segment ranging from 0 to 500 with a value of 0:

_images/fst-example1-initial.svg

The new instance is initialized with an initial segment.

Internally, this initial range is represented by two leaf nodes, with the first one storing the start key and the value for the segment both of which happen to be 0 in this example, and the second one storing the end key of 500. Note that the end key of a segment is not inclusive.

The following lines insert two new segments into this structure:

db.insert_front(10, 20, 10);
db.insert_back(50, 70, 15);

The first line inserts a segment ranging from 10 to 20 with a value of 10, and the second line from 50 to 70 with a value of 15:

_images/fst-example1-insert1.svg

Two new segments have been inserted.

You can insert a new segment either via insert_front() or insert_back(). The end result will be the same regardless of which method you use; the difference is that insert_front() begins its search for the insertion point from the first node associated with the minimum key value, whereas insert_back() starts its search from the last node associated with the maximum key value.

After the insertions, the tree now contains a total of six leaf nodes to represent all stored segments. Note that one leaf node typically represents both the end of a segment and the start of the adjacent segment that comes after it, unless it’s either the first or the last node.

The next line inserts another segment ranging from 60 to 65 having a value of 5:

db.insert_back(60, 65, 5);

As this new segment overlaps with the existing segment of 50 to 70, it will cut into a middle part of that segment to make room for the new segment. At this point, the tree contains a total of eight leaf nodes representing seven segments:

_images/fst-example1-insert2.svg

A new segment has been inserted that overlaps an existing non-empty segment.

Next, we are going to query the value associated with a key of 15 via search():

// Perform linear search.  This doesn't require the tree to be built ahead
// of time.
if (auto it = db.search(15); it != db.end())
{
    auto segment = it.to_segment();
    std::cout << "The value at 15 is " << segment->value
        << ", and this segment spans from " << segment->start << " to "
        << segment->end << std::endl;
}

Executing this code will yield the following output:

The value at 15 is 10, and this segment spans from 10 to 20

One thing to note is that the search() method performs a linear search which involves traversing only through the leaf nodes in this data structure in order to find the target segment. As such, the worst-case lookup performance is directly proportional to the number of leaf nodes.

There is another way to perform the query with better worse-case performance, that is through search_tree() as seen in the following code:

// Don't forget to build tree before calling search_tree().
db.build_tree();

// Perform tree search.  Tree search is generally a lot faster than linear
// search, but requires the tree to be built ahead of time.
if (auto it = db.search_tree(62); it != db.end())
{
    auto segment = it.to_segment();
    std::cout << "The value at 62 is " << segment->value
        << ", and this segment spans from " << segment->start << " to "
        << segment->end << std::endl;
}

The signature of the search_tree() method is identical to that of the search() method except for the name. This code generates the following output:

The value at 62 is 5, and this segment spans from 60 to 65

Query via search_tree() generally performs better since it traverses through the search tree to find the target segment. But it does require the search tree to be built ahead of time by calling build_tree(). Please be aware that if the segments have been modified after the tree was last built, you will have to rebuild the tree by calling build_tree().

Warning

You need to build the tree by calling build_tree() before performing a tree-based search via search_tree(). If the segments have been modified after the tree was last built, you will have to rebuild the tree by calling build_tree() again.

Iterate through stored segments

flat_segment_tree supports two types of iterators to allow you to iterate through the segments stored in your tree. The first way is to iterate through the individual leaf nodes one at a time by using begin() and end():

for (auto it = db.begin(); it != db.end(); ++it)
{
    std::cout << "key: " << it->first << "; value: " << it->second << std::endl;
}

Each iterator value contains a pair of two values named first and second, with the first one being the key of the segment that the node initiates, and the second one being the value associated with that segment. When executing this code with the tree from the example code above, you’ll get the following output:

key: 0; value: 0
key: 10; value: 10
key: 20; value: 0
key: 50; value: 15
key: 60; value: 5
key: 65; value: 15
key: 70; value: 0
key: 500; value: 0

Each node stores the start key and the value of the segment it initiates, and the key stored in each node is also the end key of the segment that the previous node initiates except for the first node. Note that the value stored in the last node is currently not used. It is set to be the zero value of the value type, but this may change in the future.

One thing to keep in mind is that flat_segment_tree does not support mutable iterators that let you modify the stored keys or values.

Note

flat_segment_tree does not support mutable iterators; you can only traverse the values in a read-only fashion.

You can also use range-based for loop to iterate through the leaf nodes in a similar fashion:

for (const auto& node : db)
{
    std::cout << "key: " << node.first << "; value: " << node.second << std::endl;
}

The output from this code is identical to that from the previous one.

Now, one major inconvenience of navigating through the individual leaf nodes is that you need to manually keep track of the start and end points of each segment if you need to operate on the segments rather than the nodes that comprise the segments. The good news is that flat_segment_tree does provide a way to iterate through the segments directly as the following code demonstrates:

for (auto it = db.begin_segment(); it != db.end_segment(); ++it)
{
    std::cout << "start: " << it->start << "; end: " << it->end << "; value: " << it->value << std::endl;
}

This code uses begin_segment() and end_segment() to iterate through one segment at a time with the value of each iterator containing start, end and value members that correspond with the start key, end key and the value of the segment, respectively. Running this code produces the following output:

start: 0; end: 10; value: 0
start: 10; end: 20; value: 10
start: 20; end: 50; value: 0
start: 50; end: 60; value: 15
start: 60; end: 65; value: 5
start: 65; end: 70; value: 15
start: 70; end: 500; value: 0

It’s also possible to iterate through the segments in a range-based for loop, by calling segment_range():

for (const auto& segment : db.segment_range())
{
    std::cout << "start: " << segment.start << "; end: " << segment.end << "; value: " << segment.value << std::endl;
}

This code should generate output identical to that of the previous code.

API Reference

template<typename Key, typename Value>
class flat_segment_tree

Public Types

typedef Key key_type
typedef Value value_type
typedef size_t size_type
using node = st::detail::node<key_type, leaf_value_type>
using node_ptr = typename node::node_ptr
using nonleaf_node = st::detail::nonleaf_node<key_type, nonleaf_value_type>
using const_segment_iterator = mdds::fst::detail::const_segment_iterator<flat_segment_tree>

Public Functions

inline const_iterator begin() const

Return an iterator that points to the first leaf node that correspondes with the start position of the first segment.

Returns:

immutable iterator that points to the first leaf node that corresponds with the start position of the first segment.

inline const_iterator end() const

Return an iterator that points to the position past the last leaf node that corresponds with the end position of the last segment.

Returns:

immutable iterator that points to the position past last leaf node that corresponds with the end position of the last segment.

inline const_reverse_iterator rbegin() const

Return an iterator that points to the last leaf node that correspondes with the end position of the last segment. This iterator moves in the reverse direction of a normal iterator.

Returns:

immutable reverse iterator that points to the last leaf node that corresponds with the end position of the last segment.

inline const_reverse_iterator rend() const

Return an iterator that points to the position past the first leaf node that corresponds with the start position of the first segment. This iterator moves in the reverse direction of a normal iterator.

Returns:

immutable reverse iterator that points to the position past first leaf node that corresponds with the start position of the first segment.

const_segment_iterator begin_segment() const

Return an immutable iterator that points to the first segment stored in the tree. It iterates through the segments one segment at a time. Each iterator value consists of start, end, and value members that correspond with the start and end positions of a segment and the value of that segment, respectively.

Returns:

immutable iterator that points to the first segment stored in the tree.

const_segment_iterator end_segment() const

Return an immutable iterator that points to the position past the last segment stored in the tree. It iterates through the segments one segment at a time. Each iterator value consists of start, end, and value members that correspond with the start and end positions of a segment and the value of that segment, respectively.

Returns:

immutable iterator that points to the position past the last segment stored in the tree.

const_segment_range_type segment_range() const

Return a range object that provides a begin iterator and an end sentinel.

flat_segment_tree() = delete
flat_segment_tree(key_type min_val, key_type max_val, value_type init_val)

Constructor that takes minimum and maximum keys and the value to be used for the initial segment.

Parameters:
  • min_val – minimum allowed key value for the entire series of segments.

  • max_val – maximum allowed key value for the entires series of segments.

  • init_val – value to be used for the initial segment. This value will also be used for empty segments.

flat_segment_tree(const flat_segment_tree &r)

Copy constructor only copies the leaf nodes.

flat_segment_tree(flat_segment_tree &&other)

Move constructor.

Warning

The source instance will not be usable after the move construction.

~flat_segment_tree()
flat_segment_tree<Key, Value> &operator=(const flat_segment_tree &other)

Copy assignment operator.

Note

It only copies the leaf nodes.

Parameters:

other – Source instance to copy from.

flat_segment_tree<Key, Value> &operator=(flat_segment_tree &&other)

Move assignment operator.

Parameters:

other – Source instance to move from.

void swap(flat_segment_tree &other)

Swap the content of the tree with another instance.

Parameters:

other – instance of flat_segment_tree to swap content with.

void clear()

Remove all stored segments except for the initial segment. The minimum and maximum keys and the default value will be retained after the call returns. This call will also remove the tree.

inline std::pair<const_iterator, bool> insert_front(key_type start_key, key_type end_key, value_type val)

Insert a new segment into the tree. It searches for the point of insertion from the first leaf node.

Parameters:
  • start_key – start value of the segment being inserted. The value is inclusive.

  • end_key – end value of the segment being inserted. The value is not inclusive.

  • val – value associated with this segment.

Returns:

pair of const_iterator corresponding to the start position of the inserted segment, and a boolean value indicating whether or not the insertion has modified the tree.

inline std::pair<const_iterator, bool> insert_back(key_type start_key, key_type end_key, value_type val)

Insert a new segment into the tree. Unlike the insert_front() counterpart, this method searches for the point of insertion from the last leaf node toward the first.

Parameters:
  • start_key – start value of the segment being inserted. The value is inclusive.

  • end_key – end value of the segment being inserted. The value is not inclusive.

  • val – value associated with this segment.

Returns:

pair of const_iterator corresponding to the start position of the inserted segment, and a boolean value indicating whether or not the insertion has modified the tree.

std::pair<const_iterator, bool> insert(const const_iterator &pos, key_type start_key, key_type end_key, value_type val)

Insert a new segment into the tree at or after specified point of insertion.

Parameters:
  • pos – specified insertion point

  • start_key – start value of the segment being inserted. The value is inclusive.

  • end_key – end value of the segment being inserted. The value is not inclusive.

  • val – value associated with this segment.

Returns:

pair of const_iterator corresponding to the start position of the inserted segment, and a boolean value indicating whether or not the insertion has modified the tree.

void shift_left(key_type start_key, key_type end_key)

Remove a segment specified by the start and end key values, and shift the remaining segments (i.e. those segments that come after the removed segment) to left. Note that the start and end positions of the segment being removed must be within the base segment span.

Parameters:
  • start_key – start position of the segment being removed.

  • end_key – end position of the segment being removed.

void shift_right(key_type pos, key_type size, bool skip_start_node)

Shift all segments that occur at or after the specified start position to right by the size specified.

Parameters:
  • pos – position where the right-shift occurs.

  • size – amount of shift (must be greater than 0)

  • skip_start_node – if true, and the specified position is at an existing node position, that node will not be shifted. This argument has no effect if the position specified does not coincide with any of the existing nodes.

std::pair<const_iterator, bool> search(key_type key, value_type &value, key_type *start_key = nullptr, key_type *end_key = nullptr) const

Perform leaf-node search for a value associated with a key.

Parameters:
  • key – key value

  • value – value associated with key specified gets stored upon successful search.

  • start_key – pointer to a variable where the start key value of the segment that contains the key gets stored upon successful search.

  • end_key – pointer to a varaible where the end key value of the segment that contains the key gets stored upon successful search.

Returns:

a pair of const_iterator corresponding to the start position of the segment containing the key, and a boolean value indicating whether or not the search has been successful.

std::pair<const_iterator, bool> search(const const_iterator &pos, key_type key, value_type &value, key_type *start_key = nullptr, key_type *end_key = nullptr) const

Perform leaf-node search for a value associated with a key.

Parameters:
  • pos – position from which the search should start. When the position is invalid, it falls back to the normal search.

  • key – key value.

  • value – value associated with key specified gets stored upon successful search.

  • start_key – pointer to a variable where the start key value of the segment that contains the key gets stored upon successful search.

  • end_key – pointer to a varaible where the end key value of the segment that contains the key gets stored upon successful search.

Returns:

a pair of const_iterator corresponding to the start position of the segment containing the key, and a boolean value indicating whether or not the search has been successful.

const_iterator search(key_type key) const

Perform leaf-node search for a value associated with a key.

Parameters:

key – Key value to perform search for.

Returns:

Iterator position associated with the start position of the segment containing the key, or end iterator position upon search failure.

const_iterator search(const const_iterator &pos, key_type key) const

Perform leaf-node search for a value associated with a key.

Parameters:
  • pos – Position from which the search should start if valid. In case of an invalid position, it falls back to a search starting with the first position.

  • key – Key value to perform search for.

Returns:

Iterator position associated with the start position of the segment containing the key, or end iterator position if the search has failed.

std::pair<const_iterator, bool> search_tree(key_type key, value_type &value, key_type *start_key = nullptr, key_type *end_key = nullptr) const

Perform tree search for a value associated with a key. This method assumes that the tree is valid. Call is_tree_valid() to find out whether the tree is valid, and build_tree() to build a new tree in case it’s not.

Parameters:
  • key – key value

  • value – value associated with key specified gets stored upon successful search.

  • start_key – pointer to a variable where the start key value of the segment that contains the key gets stored upon successful search.

  • end_key – pointer to a varaible where the end key value of the segment that contains the key gets stored upon successful search.

Returns:

a pair of const_iterator corresponding to the start position of the segment containing the key, and a boolean value indicating whether or not the search has been successful.

const_iterator search_tree(key_type key) const

Perform tree search for a value associated with a key. The tree must be valid before performing the search, else the search will fail.

Parameters:

key – Key value to perform search for.

Returns:

Iterator position associated with the start position of the segment containing the key, or end iterator position if the search has failed.

void build_tree()

Build a tree of non-leaf nodes based on the values stored in the leaf nodes. The tree must be valid before you can call the search_tree() method.

inline bool valid_tree() const noexcept
Returns:

true if the tree is valid, otherwise false. The tree must be valid before you can call the search_tree() method.

bool operator==(const flat_segment_tree &other) const

Equality between two flat_segment_tree instances is evaluated by comparing the keys and the values of the leaf nodes only. Neither the non-leaf nodes nor the validity of the tree is evaluated.

inline bool operator!=(const flat_segment_tree &other) const
inline key_type min_key() const
inline key_type max_key() const
inline value_type default_value() const
size_type leaf_size() const

Return the number of leaf nodes.

Returns:

number of leaf nodes.

Friends

friend struct ::mdds::fst::detail::forward_itr_handler< flat_segment_tree >
friend struct ::mdds::fst::detail::reverse_itr_handler< flat_segment_tree >
class const_iterator : public mdds::fst::detail::const_iterator_base<flat_segment_tree, ::mdds::fst::detail::forward_itr_handler<flat_segment_tree>>

Public Functions

inline const_iterator()
const_segment_iterator to_segment() const

Create a segment iterator that references the same segment the source iterator references the start key of.

class const_reverse_iterator : public mdds::fst::detail::const_iterator_base<flat_segment_tree, ::mdds::fst::detail::reverse_itr_handler<flat_segment_tree>>

Public Functions

inline const_reverse_iterator()
class const_segment_range_type

Public Functions

const_segment_range_type(node_ptr left_leaf, node_ptr right_leaf)
const_segment_iterator begin() const
const_segment_iterator end() const
struct leaf_value_type

Public Functions

inline bool operator==(const leaf_value_type &r) const
inline bool operator!=(const leaf_value_type &r) const
inline leaf_value_type()

Public Members

value_type value
struct nonleaf_value_type