Why does the C++ STL not provide any "tree" containers?

D

Deduplicator

There are two reasons you could want to use a tree:

You want to mirror the problem using a tree-like structure:
For this we have boost graph library

Or you want a container that has tree like access characteristics For this we have

std::map (and std::multimap)

std::set (and std::multiset)

Basically the characteristics of these two containers is such that they practically have to be implemented using trees (though this is not actually a requirement).

See also this question: C tree Implementation

There are many, many reasons to use a tree, even if these are the most common. Most common !equal all.

A third major reason to want a tree is for an always-sorted list with fast insertion/removal, but for that there is std:multiset.

@Durga: Not sure how the depth is relevant when you are using map as a sorted container. Map guarantees log(n) insertion/deletion/lookup (and containing elements in sorted order). This is all map is used for and is implemented (usually) as a red/black tree. A red/black tree makes sure that the tree is balanced. So the depth of the tree is directly related to the number of elements in the tree.

I disagree with this answer, both in 2008 and now. The standard library does not "have" boost, and the availability of something in boost should not be (and has not been) a reason not to adopt it into the standard. Additionally, the BGL is general and involved enough to merit specialized tree classes independent from it. Also, the fact that std::map and std::set require a tree is, IMO, another argument for having an stl::red_black_tree etc. Finally, the std::map and std::set trees are balanced, an std::tree might not be.

@einpoklum : "the availability of something in boost should not be a reason not to adopt it into the standard" - given one of the purposes of boost is to act as a proving ground for useful libraries before incorporation in the standard, I can only say "absolutely!".

p

phuclv

Probably for the same reason that there is no tree container in boost. There are many ways to implement such a container, and there is no good way to satisfy everyone who would use it.

Some issues to consider:

Are the number of children for a node fixed or variable?

How much overhead per node? - ie, do you need parent pointers, sibling pointers, etc.

What algorithms to provide? - different iterators, search algorithms, etc.

In the end, the problem ends up being that a tree container that would be useful enough to everyone, would be too heavyweight to satisfy most of the people using it. If you are looking for something powerful, Boost Graph Library is essentially a superset of what a tree library could be used for.

Here are some other generic tree implementations:

Kasper Peeters' tree.hh

Adobe's forest

core::tree

"...no good way to satisfy everyone..." Except that since stl::map, stl::multimap, and stl::set are based on stl's rb_tree, it should satisfy just as many cases as those basic types do.

Considering there's no way to retrieve the children of a node of a std::map, I wouldn't call those tree containers. Those are associative containers that are commonly implemented as trees. Big difference.

I agree with Mooing Duck, how would you implement a breadth first search on a std::map? It's going to be terribly expensive

I started using Kasper Peeters' tree.hh, however after reviewing the licensing for GPLv3, or any other GPL version, it would contaminate our commercial software. I would recommend looking at treetree provided in the comment by @hplbsh if you need a structure for commercial purposes.

The variety specific requirements on trees is an argument to have different types of trees, not to have none at all.

w

wilhelmtell

The STL's philosophy is that you choose a container based on guarantees and not based on how the container is implemented. For example, your choice of container may be based on a need for fast lookups. For all you care, the container may be implemented as a unidirectional list -- as long as searching is very fast you'd be happy. That's because you're not touching the internals anyhow, you're using iterators or member functions for the access. Your code is not bound to how the container is implemented but to how fast it is, or whether it has a fixed and defined ordering, or whether it is efficient on space, and so on.

I don't think he's talking about container implementations, he's talking about an actual tree container itself.

@MooingDuck I think what wilhelmtell means is that the C++ standard library doesn't define containers based on their underlying data structure; it only defines containers by their interface and observable characteristics like asymptotic performance. When you think about it, a tree isn't really a container (as we know them) at all. They don't even have a a straight forward end() and begin() with which you can iterate through all elements, etc.

@JordanMelo: Nonsense on all points. It's a thing that contains objects. It's very trivial to design it to have a begin() and end() and bidirectional iterators to iterate with. Each container has different characteristics. It would be useful if one could additionally have tree characteristics. Should be pretty easy.

Thus one wants to have a container that provides fast lookups for child and parent nodes, and reasonable memory requirements.

@JordanMelo: From that perspective, also adapters like queues, stacks or priority queues would not belong to the STL (they also do not have begin() and end()). And remember that a priority queue is typically a heap, which at least in theory is a tree (even though actual implementations). So even if you implemented a tree as an adapter using some different underlying data structure, it would be eligible to be included in the STL.

B

Brent Bradburn

"I want to store a hierarchy of objects as a tree"

C++11 has come and gone and they still didn't see a need to provide a std::tree, although the idea did come up (see here). Maybe the reason they haven't added this is that it is trivially easy to build your own on top of the existing containers. For example...

template< typename T >
struct tree_node
   {
   T t;
   std::vector<tree_node> children;
   };

A simple traversal would use recursion...

template< typename T >
void tree_node<T>::walk_depth_first() const
   {
   cout<<t;
   for ( auto & n: children ) n.walk_depth_first();
   }

If you want to maintain a hierarchy and you want it to work with STL algorithms, then things may get complicated. You could build your own iterators and achieve some compatibility, however many of the algorithms simply don't make any sense for a hierarchy (anything that changes the order of a range, for example). Even defining a range within a hierarchy could be a messy business.

If the project can allow for the the children of a tree_node to be sorted, then using a std::set<> in place of the std::vector<> and then adding an operator<() to the tree_node object will greatly improve 'search' performance of an 'T'-like object.

It turns out that they were lazy and actually made your first example Undefined Behavior.

@Mehrdad: I finally decided to ask for the detail behind your comment here.

many of the algorithms simply don't make any sense for a hierarchy. A matter of interpretation. Imagine a structure of stackoverflow users and each year you want those with higher amount of reputation points to boss those with lower reputation points. Thus providing BFS iterator and appropriate comparison, every year you just run std::sort(tree.begin(), tree.end()).

By the same token, you could easily build an associative tree (to model unstructured key-value records, like JSON for example) by replacing vector with map in the example above. For full support of a JSON-like structure, you could use variant to define the nodes.

s

systemsfault

If you are looking for a RB-tree implementation, then stl_tree.h might be appropriate for you too.

Strangely this is the only response that actually answers the original question.

Considering he wants a "Heiarchy", It seems safe to assume that anything with "balancing" is the wrong answer.

"This is an internal header file, included by other library headers. You should not attempt to use it directly."

@Dan: Copying it does not constitute using it directly.

J

J.J.

the std::map is based on a red black tree. You can also use other containers to help you implement your own types of trees.

It usually uses red-black trees (Its not required to do so).

GCC uses a tree to implement map. Anyone wanna look at their VC include directory to see what microsoft uses?

// Red-black tree class, designed for use in implementing STL // associative containers (set, multiset, map, and multimap). Grabbed that from my stl_tree.h file.

@J.J. At least in Studio 2010, it uses an internal ordered red-black tree of {key, mapped} values, unique keys class, defined in <xtree>. Don't have access to a more modern version right at the moment.

D

David Sykes

In a way, std::map is a tree (it is required to have the same performance characteristics as a balanced binary tree) but it doesn't expose other tree functionality. The likely reasoning behind not including a real tree data structure was probably just a matter of not including everything in the stl. The stl can be looked as a framework to use in implementing your own algorithms and data structures.

In general, if there's a basic library functionality that you want, that's not in the stl, the fix is to look at BOOST.

Otherwise, there's a bunch of libraries out there, depending on the needs of your tree.

E

Emilio Garavaglia

All STL container are externally represented as "sequences" with one iteration mechanism. Trees don't follow this idiom.

A tree data structure could provide preorder, inorder or postorder traversal via iterators. In fact this is what std::map does.

Yes and no ... it depends on what you mean by "tree". std::map is internally implemented as btree, but externally it appears as a sorted SEQUENCE of PAIRS. Given whatever element you can universally ask who is before and who is after. A general tree structures containing elements each of which contains other does not impose any sorting or direction. You can define iterators that walk a tree structure in many ways (sallow|deep first|last ...) but once you did it, an std::tree container must return one of them from a begin function. And there is no obvious reason to return one or another.

A std::map is generally represented by a balanced binary search tree, not a B-tree. The same argument you have made could apply to std::unordered_set, it has no natural order, yet presents begin and end iterators. The requirement of begin and end is just that it iterates all elements in some deterministic order, not that there has to be a natural one. preorder is a perfectly valid iteration order for a tree.

The implication of your answer is that there is no stl n-tree data structure because it is doesn't have a "sequence" interface. This is simply incorrect.

@EmiloGaravaglia: As evidenced by std::unordered_set, which has no "unique way" of iterating its members (in fact the iteration order is pseudo-random and implementation defined), but is still an stl container - this disproves your point. Iterating over each element in a container is still a useful operation, even if the order is undefined.

4

4LegsDrivenCat

I think there are several reasons why there are no STL trees. Primarily Trees are a form of recursive data structure which, like a container (list, vector, set), has very different fine structure which makes the correct choices tricky. They are also very easy to construct in basic form using the STL.

A finite rooted tree can be thought of as a container which has a value or payload, say an instance of a class A and, a possibly empty collection of rooted (sub) trees; trees with empty collection of subtrees are thought of as leaves.

template<class A>
struct unordered_tree : std::set<unordered_tree>, A
{};

template<class A>
struct b_tree : std::vector<b_tree>, A
{};

template<class A>
struct planar_tree : std::list<planar_tree>, A
{};

One has to think a little about iterator design etc. and which product and co-product operations one allows to define and be efficient between trees - and the original STL has to be well written - so that the empty set, vector or list container is really empty of any payload in the default case.

Trees play an essential role in many mathematical structures (see the classical papers of Butcher, Grossman and Larsen; also the papers of Connes and Kriemer for examples of they can be joined, and how they are used to enumerate). It is not correct to think their role is simply to facilitate certain other operations. Rather they facilitate those tasks because of their fundamental role as a data structure.

However, in addition to trees there are also "co-trees"; the trees above all have the property that if you delete the root you delete everything.

Consider iterators on the tree, probably they would be realised as a simple stack of iterators, to a node, and to its parent, ... up to the root.

template<class TREE>
struct node_iterator : std::stack<TREE::iterator>{
operator*() {return *back();}
...};

However, you can have as many as you like; collectively they form a "tree" but where all the arrows flow in the direction toward the root, this co-tree can be iterated through iterators towards the trivial iterator and root; however it cannot be navigated across or down (the other iterators are not known to it) nor can the ensemble of iterators be deleted except by keeping track of all the instances.

Trees are incredibly useful, they have a lot of structure, this makes it a serious challenge to get the definitively correct approach. In my view this is why they are not implemented in the STL. Moreover, in the past, I have seen people get religious and find the idea of a type of container containing instances of its own type challenging - but they have to face it - that is what a tree type represents - it is a node containing a possibly empty collection of (smaller) trees. The current language permits it without challenge providing the default constructor for container<B> does not allocate space on the heap (or anywhere else) for an B, etc.

I for one would be pleased if this did, in a good form, find its way into the standard.

C

Community

The problem is that there is no one-size-fits-all solution. Moreover, there is not even a one-size-fits-all interface for a tree. That is, it is not even clear which methods such a tree data structure should provide and it is not even clear what a tree is.

This explains why there is no STL support on this: The STL is for data structures that most people need, where basically everyone agrees on what a sensible interface and an efficient implementation is. For trees, such a thing just doesn't exist.

The gory details

If want to understand further what the problem is, read on. Otherwise, the paragraph above already should be sufficent to answer your question.

I said that there is not even a common interface. You might disagree, since you have one application in mind, but if you think further about it, you will see that there are countless possible operations on trees. You can either have a data structure that enables most of them efficiently, but therefore is more complex overall and has overhead for that complexity, or you have more simple data structure that only allows basic operations but these as quick as possible.

If you want the complete story, check out my paper on the topic. There you will find possible interface, asymptotic complexities on different implementations, and a general description of the problem and also related work with more possible implementations.

What is a tree?

It already starts with what you consider to be a tree:

Rooted or unrooted: most programmers want rooted, most mathematicians want unrooted. (If you wonder what unrooted is: A - B - C is a tree where either A, B, or C could be the root. A rooted tree defines which one is. An unrooted doesn't)

Single root/connected or multi root/disconnected (tree or forest)

Is sibling order relevant? If no, then can the tree structure internally reorder children on updates? If so, iteration order among siblings is no longer defined. But for most trees, sibiling order is actually not meaningful, and allowing the data structure to reorder children on update is very beneficial for some updates.

Really just a tree, or also allow DAG edges (sounds weird, but many people who initially want a tree eventually want a DAG)

Labeled or unlabled? Do you need to store any data per node, or is it only the tree structure you're interested in (the latter can be stored very succinctly)

Query operations

After we have figured out what we define to be a tree, we should define query operations: Basic operations might be "navigate to children, navigate to parent", but there are way more possible operations, e.g.:

Navigate to next/prev sibling: Even most people would consider this a pretty basic operation, it is actually almost impossible if you only have a parent pointer or a children array. So this already shows you that you might need a totally different implementation based on what operations you need.

Navigate in pre/post order

Subtree size: the number of (transitive) descendants of the current node (possibly in O(1) or O(log n), i.e., don't just enumerate them all to count)

the height of the tree in the current node. That is, the longest path from this node to any leave node. Again, in less than O(n).

Given two nodes, find the least common ancestor of the node (with O(1) memory consumption)

How many nodes are between node A and node B in a pre-/post-order traversal? (less than O(n) runtime)

I emphasized that the interesting thing here is whether these methods can be performed better than O(n), because just enumerating the whole tree is always an option. Depending on your application, it might be absolutely crucial that some operations are faster than O(n), or you might not care at all. Again, you will need vastely different data structures depending on your needs here.

Update operations

Until now, I only talked about query opertions. But now to updates. Again, there are various ways in which a tree could be updated. Depending on which you need, you need a more or less sophisticated data structure:

leaf updates (easy): Delete or add a leaf node

inner node updates (harder): Move or delete move an inner node, making its children the children of its parent

subtree updates (harder): Move or delete a subtree rooted in a node

To just give you some intuition: If you store a child array and your sibling order is important, even deleting a leaf can be O(n) as all siblings behind it have to be shifted in the child array of its parent. If you instead only have a parent pointer, leaf deletion is trivially O(1). If you don't care about sibiling order, it is also O(1) for the child array, as you can simply replace the gap with the last sibling in the array. This is just one example where different data structures will give you quite different update capabilities.

Moving a whole subtree is again trivially O(1) in case of a parent pointer, but can be O(n) if you have a data structure storing all nodes e.g., in pre-order.

Then, there are orthogonal considerations like which iterators stay valid if you perform updates. Some data structures need to invalidate all iterators in the whole tree, even if you insert a new leaf. Others only invalidate iterators in the part of the tree that is altered. Others keep all iterators (except the ones for deleted nodes) valid.

Space considerations

Tree structures can be very succinct. Roughly two bits per node are enough if you need to save on space (e.g., DFUDS or LOUDS, see this explanation to get the gist). But of course, naively, even a parent pointer is already 64 bits. Once you opt for a nicely-navigable structure, you might rather require 20 bytes per node.

With a lot of sophisication, one can also build a data structure that only takes some bits per entry, can be updated efficiently, and still enables all query operations asymptotically fast, but this is a beast of a structure that is highly complex. I once gave a practical course where I had grad students implement this paper. Some of them were able to implement it in 6 weeks (!), others failed. And while the structure has great asymptotics, its complexity makes it have quite some overhead for very simple operations.

Again, no one-size-fits-all.

Conclusion

I worked 5 years on finding the best data structure to represent a tree, and even though I came up with some and there is quite some related work, my conclusion was that there is not one. Depending on the use case, a highly sophsticated data struture will be outperformed by a simple parent pointer. Even defining the interface for a tree is hard. I tried defining one in my paper, but I have to acknowledge that there are various use cases where the interface I defined is too narrow or too large. So I doubt that this will ever end up in STL, as there are just too many tuning knobs.

P

Paul Nathan

Because the STL is not an "everything" library. It contains, essentially, the minimum structures needed to build things.

Binary trees are an extremely basic functionality, and in fact, more basic than other containers since types like std::map, std::multimap, and stl::set. Since those types are based on them, you would expect the underlying type to be exposed.

I don't think the OP is asking for a binary tree, he's asking for a tree to store a hierarchy.

Not only that, adding a tree "container" to STL would have mean to add many many new concepts, for example a tree navigator (generalizing Iterator).

"Minimum structures to build things" is a very subjective statement. You can build things with raw C++ concepts, so I gues true minimum would be no STL at all.

The standard library / STL is minimal compared to extensive library support in other environments like .NET and JAVA. I wish it would be more extensive so that for basic things (like XML, JSON; serialization; networking; gui) you don't have to include external libraries. A raw (unbalanced) tree could be an addition as other containers like a vector with sbo; circular_buffer; better hash map; dynamic_bitset with sbo; AVL and B-tree's; etc.)

r

roffez

This one looks promising and seems to be what you're looking for: http://tree.phi-sci.com/

b

bobobobo

IMO, an omission. But I think there is good reason not to include a Tree structure in the STL. There is a lot of logic in maintaining a tree, which is best written as member functions into the base TreeNode object. When TreeNode is wrapped up in an STL header, it just gets messier.

For example:

template <typename T>
struct TreeNode
{
  T* DATA ; // data of type T to be stored at this TreeNode

  vector< TreeNode<T>* > children ;

  // insertion logic for if an insert is asked of me.
  // may append to children, or may pass off to one of the child nodes
  void insert( T* newData ) ;

} ;

template <typename T>
struct Tree
{
  TreeNode<T>* root;

  // TREE LEVEL functions
  void clear() { delete root ; root=0; }

  void insert( T* data ) { if(root)root->insert(data); } 
} ;

That's a lot of owning raw pointers you have there, many of which have no need of being pointers at all.

Suggest you withdraw this answer. A TreeNode class is part of a tree implementation.

i

igagis

Reading through the answers here the common named reasons are that one cannot iterate through the tree or that the tree does not assume the similar interface to other STL containers and one could not use STL algorithms with such tree structure.

Having that in mind I tried to design my own tree data structure which will provide STL-like interface and will be usable with existing STL algorthims as much as possible.

My idea was that the tree must be based on the existing STL containers and that it must not hide the container, so that it will be accessible to use with STL algorithms.

The other important feature the tree must provide is the traversing iterators.

Here is what I was able to come up with: https://github.com/cppfw/utki/blob/master/src/utki/tree.hpp

And here are the tests: https://github.com/cppfw/utki/blob/master/tests/unit/src/tree.cpp

7

7890

All STL containers can be used with iterators. You can't have an iterator an a tree, because you don't have ''one right'' way do go through the tree.

But you can say BFS or DFS is the correct way. Or support both of them. Or any other you can imagine. Jut tell the user what it is.

in std::map there is tree iterator.

A tree could define its own custom iterator type that traverses all nodes in order from one "extreme" to the other (i.e. for any binary tree with paths 0 & 1, it could offer an iterator that goes from "all 0s" to "all 1s", and a reverse iterator that does the opposite; for a tree with a depth of 3 and starting node s, for example, it could iterate over the nodes as s000, s00, s001, s0, s010, s01, s011, s, s100, s10, s101, s1, s110, s11, s111 ("leftmost" to "rightmost"); it could also use a depth traversal pattern (s, s0, s1, s00, s01, s10, s11,

, etc.), or some other pattern, as long as it iterates over every node in such a way that each one is only passed a single time.

@doc, very good point. I think std::unordered_set was "made" a sequence because we don't know a better way of iterating over the elements other than some arbitrary way (internally given by the hash function). I think it is the opposite case of the tree: the iteration over unordered_set is underspecified, in theory there is "no way" of defining an iteration other than perhaps "randomly". In the case of tree there are many "good" (non random) ways. But, again, your point is valid.

Why does the C++ STL not provide any "tree" containers?

Follow WeChat

Want to stay one step ahead of the latest teleworks?

相似问题

Platform

Support

Contact US