
Is there a sorted_vector class, which supports insert() etc.?

Often, it is more efficient to use a sorted std::vector instead of a std::set. Does anyone know a library class sorted_vector, which basically has a similar interface to std::set, but inserts elements into the sorted vector (so that there are no duplicates), uses binary search to find elements, etc.?

I know it's not hard to write, but probably better not to waste time and use an existing implementation anyway.

Update: The reason to use a sorted vector instead of a set is: If you have hundreds of thousands of little sets that contain only 10 or so members each, it is more memory-efficient to just use sorted vectors instead.

I don't think there's a ready-made class for that. You may write your own or use lower_bound() for insertion and binary_search() for lookup.
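To make the lower_bound()/binary_search() suggestion concrete, here is a minimal sketch using only the standard library. The helper names sorted_insert and sorted_contains are made up for this illustration; they do not come from any library.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: insert value into an already sorted vector,
// keeping it sorted and free of duplicates.
// Returns true if the value was inserted, false if it was already present.
template <typename T>
bool sorted_insert(std::vector<T>& v, const T& value)
{
    // lower_bound finds the first element that is not less than value.
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    if (pos != v.end() && !(value < *pos))
        return false;      // an equivalent element is already stored
    v.insert(pos, value);  // shifts the tail one slot to the right
    return true;
}

// Hypothetical helper: lookup by binary search; v must already be sorted.
template <typename T>
bool sorted_contains(const std::vector<T>& v, const T& value)
{
    return std::binary_search(v.begin(), v.end(), value);
}
```

Starting from an empty std::vector<int>, calling sorted_insert with 3, 1, 2 and then 2 again should leave the vector holding 1, 2, 3, with the last call returning false.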
If the vectors are so small, the difference between binary and sequential search is likely to be small too, so you may as well just use a std::vector.
The difference will probably be quite large because of the cache misses that the set will incur.
@Frank: I'm a bit late to this question, but anyway :) You should check whether binary search in a sorted vector of "10 or so" elements is any faster than a plain linear search. It is quite possible that it isn't faster, or it could even be slower, as the processor's branch prediction will play an important role in this case.
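For anyone who wants to measure this, the two lookups being compared could look roughly like the sketch below. The function names contains_linear and contains_binary are invented here, and the element type is fixed to int to keep it short. Which one wins for about 10 elements depends on the hardware and the data, so it has to be timed rather than assumed.

```cpp
#include <algorithm>
#include <vector>

// Linear scan over a sorted vector. Because the data is sorted, the scan
// can stop as soon as it reaches an element that is >= key.
inline bool contains_linear(const std::vector<int>& v, int key)
{
    for (int x : v) {
        if (x >= key)
            return x == key;
    }
    return false;
}

// Binary search over the same sorted vector.
inline bool contains_binary(const std::vector<int>& v, int key)
{
    return std::binary_search(v.begin(), v.end(), key);
}
```

Both functions return the same answer for any sorted input, so they can be swapped freely inside a benchmark loop.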

Evgeny Panasyuk

Boost.Container flat_set

Boost.Container flat_[multi]map/set containers are ordered-vector based associative containers based on Austern's and Alexandrescu's guidelines. These ordered vector containers have also benefited recently with the addition of move semantics to C++, speeding up insertion and erasure times considerably. Flat associative containers have the following attributes:
- Faster lookup than standard associative containers
- Much faster iteration than standard associative containers
- Less memory consumption for small objects (and for big objects if shrink_to_fit is used)
- Improved cache performance (data is stored in contiguous memory)
- Non-stable iterators (iterators are invalidated when inserting and erasing elements)
- Non-copyable and non-movable values types can't be stored
- Weaker exception safety than standard associative containers (copy/move constructors can throw when shifting values in erasures and insertions)
- Slower insertion and erasure than standard associative containers (specially for non-movable types)

Live demo:

#include <boost/container/flat_set.hpp>
#include <iostream>
#include <ostream>

using namespace std;

int main()
{
    boost::container::flat_set<int> s;
    s.insert(1);
    s.insert(2);
    s.insert(3);
    cout << (s.find(1)!=s.end()) << endl; // prints 1: 1 is in the set
    cout << (s.find(4)!=s.end()) << endl; // prints 0: 4 was never inserted
}

jalf: If you want a sorted vector, it is likely better to insert all the elements, and then call std::sort() once, after the insertions.

boost::container::flat_set can do that automatically when it is constructed from a range:

template<typename InputIterator> 
flat_set(InputIterator first, InputIterator last, 
         const Compare & comp = Compare(), 
         const allocator_type & a = allocator_type());

Effects: Constructs an empty set using the specified comparison object and allocator, and inserts elements from the range [first, last).
Complexity: Linear in N if the range [first, last) is already sorted using comp and otherwise N*log(N), where N is last - first.


jalf

The reason such a container is not part of the standard library is that it would be inefficient. Using a vector for storage means objects have to be moved if something is inserted in the middle of the vector. Doing this on every insertion gets needlessly expensive. (On average, half the objects will have to be moved for each insertion. That's pretty costly.)

If you want a sorted vector, it is likely better to insert all the elements, and then call std::sort() once, after the insertions.
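A minimal sketch of that approach: fill the vector with push_back() in any order, then sort once. The helper name sort_and_dedupe is made up for this example, and the std::unique step is only needed if the result should behave like a set with no duplicates.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: call once after all elements have been appended
// in arbitrary order.
template <typename T>
void sort_and_dedupe(std::vector<T>& v)
{
    std::sort(v.begin(), v.end());                      // O(n log n), done once
    v.erase(std::unique(v.begin(), v.end()), v.end());  // drop adjacent duplicates
}
```

After this call the vector can be searched with std::binary_search or std::lower_bound.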


I don't see how that would solve the problem. All the objects still have to be touched, even if it is only a pointer swap. You're still trying to do something that the data structure just isn't suited for.
I started writing an answer like that, and stopped because it's simply not really true. For less than a few dozen elements, which is pretty common really, moving on average half can easily be less expensive than performing an allocation and a tree rebalance. Of course it's better to call sort once, and I personally wouldn't look for a container to do this, but it's a matter of style.
Inserting n elements one at a time into a sorted array costs O(log n) to find each insertion point plus about n/2 element moves per insertion, so O(n^2) in total for n elements, which is not efficient at all. It might work out if n is small enough, though.
@Potatoswatter: Replacing it with a node-based data structure wasn't my suggested alternative though. Like you say, the heap allocations and tree rebalancing get pricey too (although a custom allocator could help somewhat). Sorting once, at the end, was my suggestion.
I suggest a combination of std::map and std::vector as a solution.
Michael Burr

I think there's no 'sorted container' adapter in the STL because it already has associative containers that keep things sorted and that are appropriate to use in nearly all cases. To be honest, about the only reason I can think of off the top of my head for having a sorted vector<> container might be to interoperate with C functions that expect a sorted array. Of course, I may be missing something.

If you feel that a sorted vector<> would be more appropriate for your needs (being aware of the shortcomings of inserting elements into a vector), here's an implementation on Code Project:

An STL compliant sorted vector By Martin Holzherr

I've never used it, so I can't vouch for it (or its license, if any is specified). But from a quick read of the article, it looks like the author at least made a good effort to give the container adapter an appropriate STL interface.

It seems to be worth a closer look.


A sorted vector is likely to be faster until the set gets fairly big (hundreds of elements). Sets have horrible cache locality.
Billy ONeal

If you decide to roll your own, you might also want to check out Boost.uBLAS (boost::numeric::ublas). Specifically:

#include <boost/numeric/ublas/vector_sparse.hpp>

and look at coordinate_vector, which implements a vector of values and indexes. This data structure supports O(1) insertion (temporarily violating the sort order), but then sorts on demand in O(n log n). Of course, once it's sorted, lookups are O(log n). If part of the array is already sorted, the algorithm recognizes this, sorts only the newly added elements, and then does an in-place merge. If you care about efficiency, this is probably the best you can do.
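The "sort only the new elements, then merge" idea can be sketched with the standard library alone. This is not the uBLAS code, just an illustration of the technique; the helper name merge_unsorted_tail and the sorted_count parameter are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: the first sorted_count elements of v are already
// sorted, everything after them was appended in arbitrary order.
// Precondition: sorted_count <= v.size().
template <typename T>
void merge_unsorted_tail(std::vector<T>& v, std::size_t sorted_count)
{
    auto middle = v.begin() + static_cast<std::ptrdiff_t>(sorted_count);
    std::sort(middle, v.end());                      // sort only the new tail
    std::inplace_merge(v.begin(), middle, v.end());  // merge the two sorted runs
}
```

The caller has to remember how many elements were sorted before the appends started, for example by storing v.size() right after the previous merge.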


Mooing Duck

Alexandrescu's Loki has a sorted vector implementation, if you don't want to go through the relatively insignificant effort of rolling your own.

http://loki-lib.sourceforge.net/html/a00025.html


moodboom

Here is my sorted_vector class that I've been using in production code for years. It has overloads to let you use a custom predicate. I've used it for containers of pointers, which can be a really nice solution in a lot of use cases.
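The class itself is not reproduced in this thread. As a rough idea of what a sorted vector with a custom predicate can look like, here is a hypothetical minimal sketch. It is not the class described above, and every name in it (including deref_less) is invented for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal sorted_vector, NOT the class the answer refers to.
template <typename T, typename Compare = std::less<T>>
class sorted_vector
{
public:
    using const_iterator = typename std::vector<T>::const_iterator;

    explicit sorted_vector(Compare comp = Compare()) : comp_(std::move(comp)) {}

    // Inserts value unless an equivalent element is already stored.
    std::pair<const_iterator, bool> insert(const T& value)
    {
        auto pos = std::lower_bound(data_.begin(), data_.end(), value, comp_);
        if (pos != data_.end() && !comp_(value, *pos))
            return {pos, false};
        return {data_.insert(pos, value), true};
    }

    // Binary search; returns end() if no equivalent element exists.
    const_iterator find(const T& value) const
    {
        auto pos = std::lower_bound(data_.begin(), data_.end(), value, comp_);
        return (pos != data_.end() && !comp_(value, *pos)) ? pos : data_.end();
    }

    const_iterator begin() const { return data_.begin(); }
    const_iterator end() const { return data_.end(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
    Compare comp_;
};

// Example predicate for containers of pointers: order by the pointed-to value.
struct deref_less
{
    template <typename P>
    bool operator()(const P& a, const P& b) const { return *a < *b; }
};
```

With sorted_vector<const int*, deref_less>, the pointers are kept ordered by the integers they point to rather than by address, which is the kind of pointer-container use mentioned above.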

