Trie Maps

Examples

Populating Trie Map

This section illustrates how to use trie_map to build a database of city populations and perform prefix searches. In this example, we will use the 2013 populations of cities in North Carolina, and use the city names as keys.

Let’s define the type first:

using trie_map_type = mdds::trie_map<mdds::trie::std_string_trait, int>;

The first template argument specifies the trait of the key. In this example, we are using a pre-defined trait for std::string, which is defined in std_string_trait. The second template argument specifies the value type, which in this example is simply an int.

Once the type is defined, the next step is instantiation:

trie_map_type nc_cities;

It’s pretty simple as you don’t need to pass any arguments to the constructor. Now, let’s populate this data structure with some population data:

// Insert key-value pairs.
nc_cities.insert("Charlotte",     792862);
nc_cities.insert("Raleigh",       431746);
nc_cities.insert("Greensboro",    279639);
nc_cities.insert("Durham",        245475);
nc_cities.insert("Winston-Salem", 236441);
nc_cities.insert("Fayetteville",  204408);
nc_cities.insert("Cary",          151088);
nc_cities.insert("Wilmington",    112067);
nc_cities.insert("High Point",    107741);
nc_cities.insert("Greenville",    89130);
nc_cities.insert("Asheville",     87236);
nc_cities.insert("Concord",       83506);
nc_cities.insert("Gastonia",      73209);
nc_cities.insert("Jacksonville",  69079);
nc_cities.insert("Chapel Hill",   59635);
nc_cities.insert("Rocky Mount",   56954);
nc_cities.insert("Burlington",    51510);
nc_cities.insert("Huntersville",  50458);
nc_cities.insert("Wilson",        49628);
nc_cities.insert("Kannapolis",    44359);
nc_cities.insert("Apex",          42214);
nc_cities.insert("Hickory",       40361);
nc_cities.insert("Goldsboro",     36306);

It’s pretty straight-forward. Each insert() call expects a pair of string key and an integer value. You can insert your data in any order regardless of key’s sort order.

Now that the data is in, let’s perform prefix search to query all cities whose name begins with “Cha”:

cout << "Cities that start with 'Cha' and their populations:" << endl;
auto results = nc_cities.prefix_search("Cha");
for (const auto& kv : results)
{
    cout << "  " << kv.first << ": " << kv.second << endl;
}

You can perform prefix search via prefix_search() method, which returns a results object that can be iterated over using a range-based for loop. Running this code will produce the following output:

Cities that start with 'Cha' and their populations:
  Chapel Hill: 59635
  Charlotte: 792862

Let’s perform another prefix search, this time with a prefix of “W”:

cout << "Cities that start with 'W' and their populations:" << endl;
results = nc_cities.prefix_search("W");
for (const auto& kv : results)
{
    cout << "  " << kv.first << ": " << kv.second << endl;
}

You’ll see the following output when running this code:

Cities that start with 'W' and their populations:
  Wilmington: 112067
  Wilson: 49628
  Winston-Salem: 236441

Note that the results are sorted in key’s ascending order.

Note

Results from the prefix search are sorted in key’s ascending order.

Creating Packed Trie Map from Trie Map

There is also another variant of trie called packed_trie_map which is designed to store all its data in contiguous memory region. Unlike trie_map which is mutable, packed_trie_map is immutable; once populated, you can only perform queries and it is no longer possible to add new entries into the container.

One way to create an instance of packed_trie_map is from trie_map by calling its pack() method:

auto packed = nc_cities.pack();

The query methods of packed_trie_map are identical to those of trie_map. For instance, performing prefix search to find all entries whose key begins with “C” can be done as follows:

cout << "Cities that start with 'C' and their populations:" << endl;
auto packed_results = packed.prefix_search("C");
for (const auto& kv : packed_results)
{
    cout << "  " << kv.first << ": " << kv.second << endl;
}

Running this code will generate the following output:

Cities that start with 'C' and their populations:
  Cary: 151088
  Chapel Hill: 59635
  Charlotte: 792862
  Concord: 83506

You can also perform an exact-match query via find() method which returns an iterator associated with the key-value pair entry:

// Individual search.
auto it = packed.find("Wilmington");
cout << "Population of Wilmington: " << it->second << endl;

You’ll see the following output with this code:

Population of Wilmington: 112067

What if you performed an exact-match query with a key that doesn’t exist in the container? You will basically get the end iterator position as its return value. Thus, running this code:

// You get an end position iterator when the container doesn't have the
// specified key.
it = packed.find("Asheboro");

cout << "Population of Asheboro: ";

if (it == packed.end())
    cout << "not found";
else
    cout << it->second;

cout << endl;

will generate the following output:

Population of Asheboro: not found

The complete source code for the examples in these two sections is available here.

Using Packed Trie Map directly

In the previous example, we showed a way to create an instance of packed_trie_map from a populated instance of trie_map. There is also a way to instantiate and populate an instance of packed_trie_map directly, and that is what we will cover in this section.

First, declare the type:

using trie_map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, int>;

Once again, we are using the pre-defined trait for std::string as its key, and int as its value type. The next step is to prepare its entries ahead of time:

trie_map_type::entry entries[] =
{
    { MDDS_ASCII("Apex"),           42214 },
    { MDDS_ASCII("Asheville"),      87236 },
    { MDDS_ASCII("Burlington"),     51510 },
    { MDDS_ASCII("Cary"),          151088 },
    { MDDS_ASCII("Chapel Hill"),    59635 },
    { MDDS_ASCII("Charlotte"),     792862 },
    { MDDS_ASCII("Concord"),        83506 },
    { MDDS_ASCII("Durham"),        245475 },
    { MDDS_ASCII("Fayetteville"),  204408 },
    { MDDS_ASCII("Gastonia"),       73209 },
    { MDDS_ASCII("Goldsboro"),      36306 },
    { MDDS_ASCII("Greensboro"),    279639 },
    { MDDS_ASCII("Greenville"),     89130 },
    { MDDS_ASCII("Hickory"),        40361 },
    { MDDS_ASCII("High Point"),    107741 },
    { MDDS_ASCII("Huntersville"),   50458 },
    { MDDS_ASCII("Jacksonville"),   69079 },
    { MDDS_ASCII("Kannapolis"),     44359 },
    { MDDS_ASCII("Raleigh"),       431746 },
    { MDDS_ASCII("Rocky Mount"),    56954 },
    { MDDS_ASCII("Wilmington"),    112067 },
    { MDDS_ASCII("Wilson"),         49628 },
    { MDDS_ASCII("Winston-Salem"), 236441 },
};

We need to do this since packed_trie_map is immutable, and the only time we can populate its content is at instantiation time. Here, we are using the MDDS_ASCII macro to expand a string literal to its pointer value and size. Note that you need to ensure that the entries are sorted by the key in ascending order.

Warning

When instantiating packed_trie_map directly with a static set of entries, the entries must be sorted by the key in ascending order.

You can then pass this list of entries to construct the instance:

trie_map_type nc_cities(entries, MDDS_N_ELEMENTS(entries));

The MDDS_N_ELEMENTS macro will infer the size of a fixed-size array from its static definition. Once it’s instantiated, the rest of the example for performing searches will be the same as in the previous section, which we will not repeat here.

The complete source code for the example in this section is available here.

Saving and loading Packed Trie Map instances

There are times when you need to save the state of a packed_trie_map instance to a file, or an in-memory buffer, and load it back later. Doing that is now possible by using the save_state() and load_state() member methods of the packed_trie_map class.

First, let’s define the type of use:

using map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, int>;

As with the previous examples, we will use std::string as the key type and int as the value type. In this example, we are going to use the world’s largest cities and their 2018 populations as the data to store in the container.

The following code defines the entries:

std::vector<map_type::entry> entries =
{
    { MDDS_ASCII("Ahmedabad"),        7681000  },
    { MDDS_ASCII("Alexandria"),       5086000  },
    { MDDS_ASCII("Atlanta"),          5572000  },
    { MDDS_ASCII("Baghdad"),          6812000  },
    { MDDS_ASCII("Bangalore"),        11440000 },
    { MDDS_ASCII("Bangkok"),          10156000 },
    { MDDS_ASCII("Barcelona"),        5494000  },
    { MDDS_ASCII("Beijing"),          19618000 },
    { MDDS_ASCII("Belo Horizonte"),   5972000  },
    { MDDS_ASCII("Bogota"),           10574000 },
    { MDDS_ASCII("Buenos Aires"),     14967000 },
    { MDDS_ASCII("Cairo"),            20076000 },
    { MDDS_ASCII("Chengdu"),          8813000  },
    { MDDS_ASCII("Chennai"),          10456000 },
    { MDDS_ASCII("Chicago"),          8864000  },
    { MDDS_ASCII("Chongqing"),        14838000 },
    { MDDS_ASCII("Dalian"),           5300000  },
    { MDDS_ASCII("Dallas"),           6099000  },
    { MDDS_ASCII("Dar es Salaam"),    6048000  },
    { MDDS_ASCII("Delhi"),            28514000 },
    { MDDS_ASCII("Dhaka"),            19578000 },
    { MDDS_ASCII("Dongguan"),         7360000  },
    { MDDS_ASCII("Foshan"),           7236000  },
    { MDDS_ASCII("Fukuoka"),          5551000  },
    { MDDS_ASCII("Guadalajara"),      5023000  },
    { MDDS_ASCII("Guangzhou"),        12638000 },
    { MDDS_ASCII("Hangzhou"),         7236000  },
    { MDDS_ASCII("Harbin"),           6115000  },
    { MDDS_ASCII("Ho Chi Minh City"), 8145000  },
    { MDDS_ASCII("Hong Kong"),        7429000  },
    { MDDS_ASCII("Houston"),          6115000  },
    { MDDS_ASCII("Hyderabad"),        9482000  },
    { MDDS_ASCII("Istanbul"),         14751000 },
    { MDDS_ASCII("Jakarta"),          10517000 },
    { MDDS_ASCII("Jinan"),            5052000  },
    { MDDS_ASCII("Johannesburg"),     5486000  },
    { MDDS_ASCII("Karachi"),          15400000 },
    { MDDS_ASCII("Khartoum"),         5534000  },
    { MDDS_ASCII("Kinshasa"),         13171000 },
    { MDDS_ASCII("Kolkata"),          14681000 },
    { MDDS_ASCII("Kuala Lumpur"),     7564000  },
    { MDDS_ASCII("Lagos"),            13463000 },
    { MDDS_ASCII("Lahore"),           11738000 },
    { MDDS_ASCII("Lima"),             10391000 },
    { MDDS_ASCII("London"),           9046000  },
    { MDDS_ASCII("Los Angeles"),      12458000 },
    { MDDS_ASCII("Luanda"),           7774000  },
    { MDDS_ASCII("Madrid"),           6497000  },
    { MDDS_ASCII("Manila"),           13482000 },
    { MDDS_ASCII("Mexico City"),      21581000 },
    { MDDS_ASCII("Miami"),            6036000  },
    { MDDS_ASCII("Moscow"),           12410000 },
    { MDDS_ASCII("Mumbai"),           19980000 },
    { MDDS_ASCII("Nagoya"),           9507000  },
    { MDDS_ASCII("Nanjing"),          8245000  },
    { MDDS_ASCII("New York City"),    18819000 },
    { MDDS_ASCII("Osaka"),            19281000 },
    { MDDS_ASCII("Paris"),            10901000 },
    { MDDS_ASCII("Philadelphia"),     5695000  },
    { MDDS_ASCII("Pune"),             6276000  },
    { MDDS_ASCII("Qingdao"),          5381000  },
    { MDDS_ASCII("Rio de Janeiro"),   13293000 },
    { MDDS_ASCII("Riyadh"),           6907000  },
    { MDDS_ASCII("Saint Petersburg"), 5383000  },
    { MDDS_ASCII("Santiago"),         6680000  },
    { MDDS_ASCII("Sao Paulo"),        21650000 },
    { MDDS_ASCII("Seoul"),            9963000  },
    { MDDS_ASCII("Shanghai"),         25582000 },
    { MDDS_ASCII("Shenyang"),         6921000  },
    { MDDS_ASCII("Shenzhen"),         11908000 },
    { MDDS_ASCII("Singapore"),        5792000  },
    { MDDS_ASCII("Surat"),            6564000  },
    { MDDS_ASCII("Suzhou"),           6339000  },
    { MDDS_ASCII("Tehran"),           8896000  },
    { MDDS_ASCII("Tianjin"),          13215000 },
    { MDDS_ASCII("Tokyo"),            37400068 },
    { MDDS_ASCII("Toronto"),          6082000  },
    { MDDS_ASCII("Washington, D.C."), 5207000  },
    { MDDS_ASCII("Wuhan"),            8176000  },
    { MDDS_ASCII("Xi'an"),            7444000  },
    { MDDS_ASCII("Yangon"),           5157000  },
};

It’s a bit long as it contains entries for 81 cities. We are then going to create an instance of the packed_trie_map class directly:

map_type cities(entries.data(), entries.size());

Let’s print the size of the container to make sure the container has been successfully populated:

cout << "Number of cities: " << cities.size() << endl;

You will see the following output:

Number of cities: 81

if the container has been successfully populated. Now, let’s run a prefix search on names beginning with an ‘S’:

cout << "Cities that begin with 'S':" << endl;
auto results = cities.prefix_search("S");
for (const auto& city : results)
    cout << "  * " << city.first << ": " << city.second << endl;

to make sure you get the following ten cities and their populations as the output:

Cities that begin with 'S':
  * Saint Petersburg: 5383000
  * Santiago: 6680000
  * Sao Paulo: 21650000
  * Seoul: 9963000
  * Shanghai: 25582000
  * Shenyang: 6921000
  * Shenzhen: 11908000
  * Singapore: 5792000
  * Surat: 6564000
  * Suzhou: 6339000

So far so good. Next, we will use the save_state() method to dump the internal state of this container to a file named cities.bin:

std::ofstream outfile("cities.bin", ios::binary);
cities.save_state(outfile);

This will create a file named cities.bin which contains a binary blob representing the content of this container in the current working directory. Run the ls -l cities.bin command to make sure the file has been created:

-rw-r--r-- 1 kohei kohei 17713 Jun 20 12:49 cities.bin

Now that the state of the container has been fully serialized to a file, let’s work on restoring its content in another, brand-new instance of packed_trie_map.

map_type cities_loaded;

std::ifstream infile("cities.bin", ios::binary);
cities_loaded.load_state(infile);

Here, we used the load_state() method to restore the state from the file we have previously created. Let’s make sure that this new instance has content equivalent to that of the original:

cout << "Equal to the original? " << std::boolalpha << (cities == cities_loaded) << endl;

If you see the following output:

Equal to the original? true

then this new instance has equivalent contant as the original one. Let’s also make sure that it contains the same number of entries as the original:

cout << "Number of cities: " << cities_loaded.size() << endl;

Hopefully you will see the following output:

Number of cities: 81

Lastly, let’s run on this new instance the same prefix search we did on the original instance, to make sure we still get the same results:

cout << "Cities that begin with 'S':" << endl;
auto results = cities_loaded.prefix_search("S");
for (const auto& city : results)
    cout << "  * " << city.first << ": " << city.second << endl;

You should see the following output:

Cities that begin with 'S':
  * Saint Petersburg: 5383000
  * Santiago: 6680000
  * Sao Paulo: 21650000
  * Seoul: 9963000
  * Shanghai: 25582000
  * Shenyang: 6921000
  * Shenzhen: 11908000
  * Singapore: 5792000
  * Surat: 6564000
  * Suzhou: 6339000

which is the same output we saw in the first prefix search.

The complete source code for this example is found here.

Saving Packed Trie Map with custom value type

In the previos example, you didn’t have to explicitly specify the serializer type to the save_state() and load_state() methods, even though these two methods require the serializer type as their template arguments. That’s because the library provides default serializer types for

  • numeric value types i.e. integers, float and double,

  • std::string, and

  • the standard sequence types, such as std::vector, whose elements are of numeric value types,

and the previous example used int as the value type.

In this section, we are going to illustrate how you can write your own custom serializer to allow serialization of your own custom value type. In this example, we are going to use the list of presidents of the United States, with the names of the presidents as the keys, and their years of inauguration and political affiliations as the values.

We will use the following structure to store the values:

enum affiliated_party_t : uint8_t
{
    unaffiliated = 0,
    federalist,
    democratic_republican,
    democratic,
    whig,
    republican,
    national_union,
    republican_national_union,
};

struct us_president
{
    uint16_t year;
    affiliated_party_t party;
};

Each entry stores the year as a 16-bit integer and the affiliated party as an enum value of 8-bit width.

Next, let’s define the container type:

using map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, us_president>;

As with the previous example, the first step is to define the entries that are sorted by the keys, which in this case are the president’s names:

std::vector<map_type::entry> entries =
{
    { MDDS_ASCII("Abraham Lincoln"),        { 1861, republican_national_union } },
    { MDDS_ASCII("Andrew Jackson"),         { 1829, democratic                } },
    { MDDS_ASCII("Andrew Johnson"),         { 1865, national_union            } },
    { MDDS_ASCII("Barack Obama"),           { 2009, democratic                } },
    { MDDS_ASCII("Benjamin Harrison"),      { 1889, republican                } },
    { MDDS_ASCII("Bill Clinton"),           { 1993, democratic                } },
    { MDDS_ASCII("Calvin Coolidge"),        { 1923, republican                } },
    { MDDS_ASCII("Chester A. Arthur"),      { 1881, republican                } },
    { MDDS_ASCII("Donald Trump"),           { 2017, republican                } },
    { MDDS_ASCII("Dwight D. Eisenhower"),   { 1953, republican                } },
    { MDDS_ASCII("Franklin D. Roosevelt"),  { 1933, democratic                } },
    { MDDS_ASCII("Franklin Pierce"),        { 1853, democratic                } },
    { MDDS_ASCII("George H. W. Bush"),      { 1989, republican                } },
    { MDDS_ASCII("George W. Bush"),         { 2001, republican                } },
    { MDDS_ASCII("George Washington"),      { 1789, unaffiliated              } },
    { MDDS_ASCII("Gerald Ford"),            { 1974, republican                } },
    { MDDS_ASCII("Grover Cleveland 1"),     { 1885, democratic                } },
    { MDDS_ASCII("Grover Cleveland 2"),     { 1893, democratic                } },
    { MDDS_ASCII("Harry S. Truman"),        { 1945, democratic                } },
    { MDDS_ASCII("Herbert Hoover"),         { 1929, republican                } },
    { MDDS_ASCII("James A. Garfield"),      { 1881, republican                } },
    { MDDS_ASCII("James Buchanan"),         { 1857, democratic                } },
    { MDDS_ASCII("James K. Polk"),          { 1845, democratic                } },
    { MDDS_ASCII("James Madison"),          { 1809, democratic_republican     } },
    { MDDS_ASCII("James Monroe"),           { 1817, democratic_republican     } },
    { MDDS_ASCII("Jimmy Carter"),           { 1977, democratic                } },
    { MDDS_ASCII("John Adams"),             { 1797, federalist                } },
    { MDDS_ASCII("John F. Kennedy"),        { 1961, democratic                } },
    { MDDS_ASCII("John Quincy Adams"),      { 1825, democratic_republican     } },
    { MDDS_ASCII("John Tyler"),             { 1841, whig                      } },
    { MDDS_ASCII("Lyndon B. Johnson"),      { 1963, democratic                } },
    { MDDS_ASCII("Martin Van Buren"),       { 1837, democratic                } },
    { MDDS_ASCII("Millard Fillmore"),       { 1850, whig                      } },
    { MDDS_ASCII("Richard Nixon"),          { 1969, republican                } },
    { MDDS_ASCII("Ronald Reagan"),          { 1981, republican                } },
    { MDDS_ASCII("Rutherford B. Hayes"),    { 1877, republican                } },
    { MDDS_ASCII("Theodore Roosevelt"),     { 1901, republican                } },
    { MDDS_ASCII("Thomas Jefferson"),       { 1801, democratic_republican     } },
    { MDDS_ASCII("Ulysses S. Grant"),       { 1869, republican                } },
    { MDDS_ASCII("Warren G. Harding"),      { 1921, republican                } },
    { MDDS_ASCII("William Henry Harrison"), { 1841, whig                      } },
    { MDDS_ASCII("William Howard Taft"),    { 1909, republican                } },
    { MDDS_ASCII("William McKinley"),       { 1897, republican                } },
    { MDDS_ASCII("Woodrow Wilson"),         { 1913, democratic                } },
    { MDDS_ASCII("Zachary Taylor"),         { 1849, whig                      } },
};

Note that we need to add numeric suffixes to the entries for Grover Cleveland, who became president twice in two separate periods, in order to make the keys for his entries unique.

Now, proceed to create an instance of packed_trie_map:

map_type us_presidents(entries.data(), entries.size());

and inspect its size to make sure it is instantiated properly:

cout << "Number of entries: " << us_presidents.size() << endl;

You should see the following output:

Number of entries: 45

Before we proceed to save the state of this instance, let’s define the custom serializer type first:

struct us_president_serializer
{
    union bin_buffer
    {
        char buffer[2];
        uint16_t i16;
        affiliated_party_t party;
    };

    static constexpr bool variable_size = false;
    static constexpr size_t value_size = 3;

    static void write(std::ostream& os, const us_president& v)
    {
        bin_buffer buf;

        // Write the year value first.
        buf.i16 = v.year;
        os.write(buf.buffer, 2);

        // Write the affiliated party value.
        buf.party = v.party;
        os.write(buf.buffer, 1);
    }

    static void read(std::istream& is, size_t n, us_president& v)
    {
        // For a fixed-size value type, this should equal the defined value size.
        assert(n == 3);

        bin_buffer buf;

        // Read the year value.
        is.read(buf.buffer, 2);
        v.year = buf.i16;

        // Read the affiliated party value.
        is.read(buf.buffer, 1);
        v.party = buf.party;
    }
};

A custom value type can be either variable-size or fixed-size. For a variable-size value type, each value segment is preceded by the byte length of that segment. For a fixed-size value type, the byte length of all of the value segments is written only once up-front, followed by one or more value segments of equal byte length.

Since the value type in this example is fixed-size, we set the value of the variable_size static constant to false, and define the size of the value to 3 (bytes) as the value_size static constant. Keep in mind that you need to define the value_size constant only for fixed-size value types; if your value type is variable-size, you can leave it out.

Additionally, you need to define two static methods - one for writing to the output stream, and one for reading from the input stream. The write method must have the following signature:

static void write(std::ostream& os, const T& v);

where the T is the value type. In the body of this method you write to the output stream the bytes that represent the value. The length of the bytes you write must match the size specified by the value_size constant.

The read method must have the following signature:

static void read(std::istream& is, size_t n, T& v);

where the T is the value type, and the n specifies the length of the bytes you need to read for the value. For a fixed-size value type, the value of n should equal the value_size constant. Your job is to read the bytes off of the input stream for the length specified by the n, and populate the value instance passed to the method as the third argument.

Now that we have defined the custom serializer type, let’s proceed to save the state to a file:

std::ofstream outfile("us-presidents.bin", ios::binary);
us_presidents.save_state<us_president_serializer>(outfile);

This time around, we are specifying the serializer type explicitly as the template argument to the save_state() method. Otherwise it is no different than what we did in the previous example.

Let’s create another instance of packed_trie_map and restore the state back from the file we just created:

map_type us_presidents_loaded;

std::ifstream infile("us-presidents.bin", ios::binary);
us_presidents_loaded.load_state<us_president_serializer>(infile);

Once again, aside from explicitly specifying the serializer type as the template argument to the load_state() method, it is identical to the way we did in the previous example.

Let’s compare the new instance with the old one to see if the two are equal:

cout << "Equal to the original? " << std::boolalpha << (us_presidents == us_presidents_loaded) << endl;

The output says:

Equal to the original? true

They are. While we are at it, let’s run a simple prefix search to find out all the US presidents whose first name is ‘John’:

cout << "Presidents whose first name is 'John':" << endl;
auto results = us_presidents_loaded.prefix_search("John");
for (const auto& entry : results)
    cout << "  * " << entry.first << " (" << entry.second.year << "; " << entry.second.party << ")" << endl;

Here is the output:

Presidents whose first name is 'John':
  * John Adams (1797; Federalist)
  * John F. Kennedy (1961; Democratic)
  * John Quincy Adams (1825; Democratic Republican)
  * John Tyler (1841; Whig)

This looks like the correct results!

You can find the complete source code for this example here.

API Reference

Trie Map

template<typename _KeyTrait, typename _ValueT>
class mdds::trie_map

Trie map provides storage for multiple key-value pairs where keys are either strings, or otherwise consist of arrays of characters. The keys are stored in an ordered tree structure known as trie, or sometimes referred to as prefix tree.

Public Types

typedef packed_trie_map<_KeyTrait, _ValueT> packed_type
typedef _KeyTrait key_trait_type
typedef key_trait_type::key_type key_type
typedef key_trait_type::key_buffer_type key_buffer_type
typedef key_trait_type::key_unit_type key_unit_type
typedef _ValueT value_type
typedef size_t size_type
typedef std::pair<key_type, value_type> key_value_type
using const_iterator = trie::detail::const_iterator<trie_map>
using iterator = trie::detail::iterator<trie_map>
typedef trie::detail::search_results<trie_map> search_results

Public Functions

trie_map()

Default constructor.

trie_map(const trie_map &other)
trie_map(trie_map &&other)
const_iterator begin() const
iterator begin()
const_iterator end() const
iterator end()
trie_map &operator=(trie_map other)
void swap(trie_map &other)
void insert(const key_type &key, const value_type &value)

Insert a new key-value pair.

Parameters
  • key: key value.

  • value: value to associate with the key.

void insert(const key_unit_type *key, size_type len, const value_type &value)

Insert a new key-value pair.

Parameters
  • key: pointer to the first character of a character array that stores key value.

  • len: length of the character array storing the key.

  • value: value to associate with the key.

bool erase(const key_unit_type *key, size_type len)

Erase a key and the value associated with it.

Return

true if a key is erased, false otherwise.

Parameters
  • key: pointer to the first character of a character array that stores key value.

  • len: length of the character array storing the key.

const_iterator find(const key_type &key) const

Find a value associated with a specified key.

Return

immutable iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • key: key to match.

const_iterator find(const key_unit_type *input, size_type len) const

Find a value associated with a specified key.

Return

immutable iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • input: pointer to an array whose value represents the key to match.

  • len: length of the matching key value.

iterator find(const key_type &key)

Find a value associated with a specified key.

Return

mutable iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • key: key to match.

iterator find(const key_unit_type *input, size_type len)

Find a value associated with a specified key.

Return

mutable iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • input: pointer to an array whose value represents the key to match.

  • len: length of the matching key value.

search_results prefix_search(const key_type &prefix) const

Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.

Return

results object that contains all matching key-value pairs. The results are sorted by the key in ascending order.

Parameters
  • prefix: prefix to match.

search_results prefix_search(const key_unit_type *prefix, size_type len) const

Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.

Return

results object that contains all matching key-value pairs. The results are sorted by the key in ascending order.

Parameters
  • prefix: pointer to an array whose value represents the prefix to match.

  • len: length of the prefix value to match.

size_type size() const

Return the number of entries in the map.

Return

the number of entries in the map.

bool empty() const noexcept
void clear()

Empty the container.

packed_type pack() const

Create a compressed and immutable version of the container which provides better search performance and requires much less memory footprint.

Return

an instance of mdds::packed_trie_map with the same content.

Friends

friend class trie::detail::iterator_base< trie_map, true >
friend class trie::detail::iterator_base< trie_map, false >
friend class trie::detail::const_iterator< trie_map >
friend class trie::detail::iterator< trie_map >
friend class trie::detail::search_results< trie_map >

Packed Trie Map

template<typename _KeyTrait, typename _ValueT>
class mdds::packed_trie_map

An immutable trie container that packs its content into a contiguous array to achieve both space efficiency and lookup performance. The user of this data structure must provide a pre-constructed list of key-value entries that are sorted by the key in ascending order, or construct from an instance of mdds::trie_map.

Note that, since this container is immutable, the content of the container cannot be modified once constructed.

Public Types

typedef _KeyTrait key_trait_type
typedef key_trait_type::key_type key_type
typedef key_trait_type::key_buffer_type key_buffer_type
typedef key_trait_type::key_unit_type key_unit_type
typedef _ValueT value_type
typedef size_t size_type
typedef std::pair<key_type, value_type> key_value_type
typedef trie::detail::packed_iterator_base<packed_trie_map> const_iterator
typedef trie::detail::packed_search_results<packed_trie_map> search_results

Public Functions

packed_trie_map()
packed_trie_map(const entry *entries, size_type entry_size)

Constructor that initializes the content from a static list of key-value entries. The caller must ensure that the key-value entries are sorted in ascending order, else the behavior is undefined.

Parameters
  • entries: pointer to the array of key-value entries.

  • entry_size: size of the key-value entry array.

  • null_value: null value to return when the find method fails to find a matching entry.

packed_trie_map(const trie_map<key_trait_type, value_type> &other)

Constructor to allow construction of an instance from the content of a mdds::trie_map instance.

Parameters

packed_trie_map(const packed_trie_map &other)
packed_trie_map(packed_trie_map &&other)
packed_trie_map &operator=(packed_trie_map other)
bool operator==(const packed_trie_map &other) const
bool operator!=(const packed_trie_map &other) const
const_iterator begin() const
const_iterator end() const
const_iterator cbegin() const
const_iterator cend() const
const_iterator find(const key_type &key) const

Find a value associated with a specified key.

Return

iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • key: key to match.

const_iterator find(const key_unit_type *input, size_type len) const

Find a value associated with a specified key.

Return

iterator that references a value associated with the key, or the end position in case the key is not found.

Parameters
  • input: pointer to an array whose value represents the key to match.

  • len: length of the matching key value.

search_results prefix_search(const key_type &prefix) const

Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.

Return

results object containing all matching key-value pairs.

Parameters
  • prefix: prefix to match.

search_results prefix_search(const key_unit_type *prefix, size_type len) const

Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.

Return

results object that contains all matching key-value pairs. The results are sorted by the key in ascending order.

Parameters
  • prefix: pointer to an array whose value represents the prefix to match.

  • len: length of the prefix value to match.

size_type size() const noexcept

Return the number of entries in the map.

Return

the number of entries in the map.

bool empty() const noexcept
void swap(packed_trie_map &other)
template<typename _Func = trie::value_serializer<value_type>>
void save_state(std::ostream &os) const

Save the state of the instance of this class to an external buffer.

Parameters
  • os: output stream to write the state to.

template<typename _Func = trie::value_serializer<value_type>>
void load_state(std::istream &is)

Restore the state of the instance of this class from an external buffer.

Parameters
  • is: input stream to load the state from.

void dump_structure() const

Dump the structure of the trie content for debugging. You must define MDDS_TRIE_MAP_DEBUG in order for this method to generate output.

Friends

friend class trie::detail::packed_iterator_base< packed_trie_map >
friend class trie::detail::packed_search_results< packed_trie_map >
struct entry

Single key-value entry. Caller must provide at compile time a static array of these entries.

Public Functions

entry(const key_unit_type *_key, size_type _keylen, value_type _value)

Public Members

const key_unit_type *key
size_type keylen
value_type value

Traits

template<typename ContainerT>
struct mdds::trie::std_container_trait

Template for a key type implemented using a typical STL container type.

Public Types

using key_type = ContainerT

type used to store a key value.

using key_buffer_type = key_type

type used to build an intermediate key value, from which a final key value is to be created. It is expected to be an array structure whose content is contiguous in memory. Its elements must be of key_unit_type.

using key_unit_type = typename key_type::value_type

type that represents a single character inside a key or a key buffer object. A key object is expected to store a series of elements of this type.

Public Static Functions

key_buffer_type to_key_buffer(const key_unit_type *str, size_t length)

Function called to create and initialize a buffer object from a given initial key value.

Return

buffer object containing the specified key value.

Parameters
  • str: pointer to the first character of the key value.

  • length: length of the key value.

key_buffer_type to_key_buffer(const key_type &key)

Function called to create and initialize a buffer object from a given initial key value.

Return

buffer object containing the specified key value.

Parameters
  • key: key value

const key_unit_type *buffer_data(const key_buffer_type &buf)
size_t buffer_size(const key_buffer_type &buf)
void push_back(key_buffer_type &buffer, key_unit_type c)

Function called to append a single character to the end of a key buffer.

Parameters
  • buffer: buffer object to append character to.

  • c: character to append to the buffer.

void pop_back(key_buffer_type &buffer)

Function called to remove a single character from the tail of an existing key buffer.

Parameters
  • buffer: buffer object to remove character from.

key_type to_key(const key_buffer_type &buf)

Function called to create a final string object from an existing buffer.

Return

string object whose content is created from the buffer object.

Parameters
  • buf: buffer object to create a final string object from.

using mdds::trie::std_string_trait = std_container_trait<std::string>

Value Serializers

template<typename T, typename U = void>
struct value_serializer : public mdds::trie::numeric_value_serializer<T>

Default value serializer for mdds::packed_trie_map’s load_state and save_state methods. The primary template is used for numeric value types, and template specializations exist for std::string, as well as sequence containers, such as std::vector, whose elements are of numeric types.

template<typename T>
struct mdds::trie::numeric_value_serializer

Serializer for numeric data types.

Subclassed by mdds::trie::value_serializer< T, U >

Public Static Functions

void write(std::ostream &os, const T &v)
void read(std::istream &is, size_t n, T &v)

Public Static Attributes

constexpr bool variable_size = false
constexpr size_t value_size = sizeof(T)
template<typename T>
struct mdds::trie::variable_value_serializer

Serializer for variable-size data types.

Public Static Functions

void write(std::ostream &os, const T &v)
void read(std::istream &is, size_t n, T &v)

Public Static Attributes

constexpr bool variable_size = true
template<typename T>
struct mdds::trie::numeric_sequence_value_serializer

Serializer for standard sequence container whose value type is of numeric value type.

Subclassed by mdds::trie::value_serializer< T, typename std::enable_if< has_value_type< T >::value >::type >

Public Types

using element_serializer = numeric_value_serializer<typename T::value_type>

Public Static Functions

void write(std::ostream &os, const T &v)
void read(std::istream &is, size_t n, T &v)

Public Static Attributes

constexpr bool variable_size = true