ChatGPT解决这个技术问题 Extra ChatGPT

What is the best way to concatenate two vectors?

I'm using multitreading and want to merge the results. For example:

std::vector<int> A;
std::vector<int> B;
std::vector<int> AB;

I want AB to have to contents of A and the contents of B in that order. What's the most efficient way of doing something like this?

If looking for efficiency when you work with large size containers, it might be more efficient to use list, where you can splice one to the other with several pointer operations. But list has space overhead (consider using single linked list).
Does this answer your question? Concatenating two std::vectors

K
Kirill V. Lyadvinsky
AB.reserve( A.size() + B.size() ); // preallocate memory
AB.insert( AB.end(), A.begin(), A.end() );
AB.insert( AB.end(), B.begin(), B.end() );

Thanks! Wouldn't have thought of reserve.
it should copy each element, so it's O(n)
Not sure whether to ask a new question or not, but can this answer be improved when taking move semantics into account? Is there some way I can expect/instruct the compiler to do a single memory move instead of looping over all elements?
@boycy No. It is amortized constant time to push_back one element. To push back n elements is O(n)
@Konrad I didn't imply otherwise, but thanks for the clarification. Note the complexity of an insert operation is never given in terms of the number of elements being inserted - which will always give O(n) - but in terms of the number of elements already in the container, since that provides a measure of its scalability.
R
Robbie Morrison

This is precisely what the member function std::vector::insert is for

std::vector<int> AB = A;
AB.insert(AB.end(), B.begin(), B.end());

@Nick: Slow compared to what?
Maybe that it checks for enough space on each insert of element? Using reserve beforehand will speed it up.
@Nick: I wouldn't be surprised if every modern stdlib implementation specialized insert on random-access iterators and reserved up-front.
@Gman: That's a fair point since we know that the source is also a vector (where iterator distance has O(1) complexity). Still, the performance guarantees of insert are something to be mindful of when you can often do better by planning ahead.
@RvdK checking for space is only a few instructions: load capacity, compare to size, conditional jump; all of which is negligible cost for most cases. Since size < capacity most of the time, branch prediction will likely cause the non-reallocating branch's instructions to be in the instruction pipeline, minimising branch-induced latency except for low iteration count. This assumes a good vector implementation, plus CPU instruction pipeline & [good] branch prediction, but those are pretty reliable assumptions for a modern toolchain and desktop machine. Don't know about smartphones though..
b
bradgonesurfing

Depends on whether you really need to physically concatenate the two vectors or you want to give the appearance of concatenation of the sake of iteration. The boost::join function

http://www.boost.org/doc/libs/1_43_0/libs/range/doc/html/range/reference/utilities/join.html

will give you this.

std::vector<int> v0;
v0.push_back(1);
v0.push_back(2);
v0.push_back(3);

std::vector<int> v1;
v1.push_back(4);
v1.push_back(5);
v1.push_back(6);
...

BOOST_FOREACH(const int & i, boost::join(v0, v1)){
    cout << i << endl;
}

should give you

1
2
3
4
5
6

Note boost::join does not copy the two vectors into a new container but generates a pair of iterators (range) that cover the span of both containers. There will be some performance overhead but maybe less that copying all the data to a new container first.


Nice idea. After thinking for a while I realized this goal may also be accomplished without using boost libraries. I've posted an answer explaining how.
R
Ronald Souza

In the direction of Bradgonesurfing's answer, many times one doesn't really need to concatenate two vectors (O(n)), but instead just work with them as if they were concatenated (O(1)). If this is your case, it can be done without the need of Boost libraries.

The trick is to create a vector proxy: a wrapper class which manipulates references to both vectors, externally seen as a single, contiguous one.

USAGE

std::vector<int> A{ 1, 2, 3, 4, 5};
std::vector<int> B{ 10, 20, 30 };

VecProxy<int> AB(A, B);  // ----> O(1). No copies performed.

for (size_t i = 0; i < AB.size(); ++i)
    std::cout << AB[i] << " ";  // 1 2 3 4 5 10 20 30

IMPLEMENTATION

template <class T>
class VecProxy {
private:
    std::vector<T>& v1, v2;
public:
    VecProxy(std::vector<T>& ref1, std::vector<T>& ref2) : v1(ref1), v2(ref2) {}
    const T& operator[](const size_t& i) const;
    const size_t size() const;
};

template <class T>
const T& VecProxy<T>::operator[](const size_t& i) const{
    return (i < v1.size()) ? v1[i] : v2[i - v1.size()];
};

template <class T>
const size_t VecProxy<T>::size() const { return v1.size() + v2.size(); };

MAIN BENEFIT

It's O(1) (constant time) to create it, and with minimal extra memory allocation.

SOME STUFF TO CONSIDER

You should only go for it if you really know what you're doing when dealing with references. This solution is intended for the specific purpose of the question made, for which it works pretty well. To employ it in any other context may lead to unexpected behavior if you are not sure on how references work.

In this example, AB does not provide a non-const access operator ([ ]). Feel free to include it, but keep in mind: since AB contains references, to assign it values will also affect the original elements within A and/or B. Whether or not this is a desirable feature, it's an application-specific question one should carefully consider.

Any changes directly made to either A or B (like assigning values, sorting, etc.) will also "modify" AB. This is not necessarily bad (actually, it can be very handy: AB does never need to be explicitly updated to keep itself synchronized to both A and B), but it's certainly a behavior one must be aware of. Important exception: to resize A and/or B to sth bigger may lead these to be reallocated in memory (for the need of contiguous space), and this would in turn invalidate AB.

Because every access to an element is preceded by a test (namely, "i < v1.size()"), VecProxy access time, although constant, is also a bit slower than that of vectors.

This approach can be generalized to n vectors. I haven't tried, but it shouldn't be a big deal.


exactly what i needed, thanks a lot
a
aloisdg

Based on Kiril V. Lyadvinsky answer, I made a new version. This snippet use template and overloading. With it, you can write vector3 = vector1 + vector2 and vector4 += vector3. Hope it can help.

template <typename T>
std::vector<T> operator+(const std::vector<T> &A, const std::vector<T> &B)
{
    std::vector<T> AB;
    AB.reserve(A.size() + B.size());                // preallocate memory
    AB.insert(AB.end(), A.begin(), A.end());        // add A;
    AB.insert(AB.end(), B.begin(), B.end());        // add B;
    return AB;
}

template <typename T>
std::vector<T> &operator+=(std::vector<T> &A, const std::vector<T> &B)
{
    A.reserve(A.size() + B.size());                // preallocate memory without erase original data
    A.insert(A.end(), B.begin(), B.end());         // add B;
    return A;                                        // here A could be named AB
}

Do you mean to add the elements of each vector to each other? Or you mean to append? This is clear now but for the next 5 years..? You shouldn't overload operator if meaning is ambiguous.
@S.R I mean to concate. I wrote this answer 3 years ago. I still know what It means. No problem there. If C++ could provide its own overload it will be even better. (and yes :: is taken ;)
Definitely not clear in general that v1 + v2 doesn't represent addition.
@Apollys well
Alternative would be to use @ like in F#
E
Erik Campobadal

One more simple variant which was not yet mentioned:

copy(A.begin(),A.end(),std::back_inserter(AB));
copy(B.begin(),B.end(),std::back_inserter(AB));

And using merge algorithm:

#include <algorithm>
#include <vector>
#include <iterator>
#include <iostream>
#include <sstream>
#include <string>

template<template<typename, typename...> class Container, class T>
std::string toString(const Container<T>& v)
{
    std::stringstream ss;
    std::copy(v.begin(), v.end(), std::ostream_iterator<T>(ss, ""));
    return ss.str();
};


int main()
{
    std::vector<int> A(10);
    std::vector<int> B(5);  //zero filled
    std::vector<int> AB(15);

    std::for_each(A.begin(), A.end(),
            [](int& f)->void
            {
                f = rand() % 100;
            });

    std::cout << "before merge: " << toString(A) << "\n";
    std::cout << "before merge: " << toString(B) << "\n";
    merge(B.begin(),B.end(), begin(A), end(A), AB.begin(), [](int&,int&)->bool {});
    std::cout << "after merge:  " << toString(AB) << "\n";

    return 1;
}


u
user3360767

All the solutions are correct, but I found it easier just write a function to implement this. like this:

template <class T1, class T2>
void ContainerInsert(T1 t1, T2 t2)
{
    t1->insert(t1->end(), t2->begin(), t2->end());
}

That way you can avoid the temporary placement like this:

ContainerInsert(vec, GetSomeVector());

t
tsnorri

For this use case, if you know beforehand the number of results each thread produces, you could preallocate AB and pass a std::span to each thread. This way the concatenation need not be done. Example:

std::vector<int> AB(total_number_of_results, 0);
std::size_t chunk_length = …;
std::size_t chunk2_start = chunk_length;
std::size_t chunk3_start = 2 * chunk_length; // If needed
…
// Pass these to the worker threads.
std::span<int> A(AB.data(), chunk_length);
std::span<int> B(AB.data() + chunk2_start, chunk_length);
…

S
Sudharsan

My answer is based on Mr.Ronald Souza's original solution. In addition to his original solution, I've written a vector proxy that supports iterators too!

short description for people who are not aware of the context of the original solution: the joined_vector template class (i.e the vector proxy)takes two references of two vectors as constructor arguments, it then treats them as one contiguous vector. My implementation also supports a forward-iterator.

USAGE:

int main()
{
    std::vector<int> a1;
    std::vector<int> a2;

    joined_vector<std::vector<int>> jv(a1,a2);

    for (int i = 0; i < 5; i++)
        a1.push_back(i);
    for (int i = 5; i <=10; i++)
        a2.push_back(i);

    for (auto e : jv)
        std::cout << e<<"\n";
    for (int i = 0; i < jv.size(); i++)
        std::cout << jv[i] << "\n";
    return 0;
}

IMPLEMENTATION:

template<typename _vec>
class joined_vector
{
    _vec& m_vec1;
    _vec& m_vec2;

public:

    struct Iterator
    {
        typedef typename _vec::iterator::value_type type_value;
        typedef typename _vec::iterator::value_type* pointer;
        typedef typename _vec::iterator::value_type& reference;
        typedef std::forward_iterator_tag iterator_category;
        typedef std::ptrdiff_t difference_type;
        _vec* m_vec1;
        _vec* m_vec2;
        Iterator(pointer ptr) :m_ptr(ptr)
        {

        }
        Iterator operator++()
        {
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return m_ptr;
        }
        Iterator operator++(int)
        {
            pointer curr = m_ptr;
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return curr;
        }
        reference operator *()
        {
            return *m_ptr;
        }
        pointer operator ->()
        {
            return m_ptr;
        }

        friend bool operator == (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr == itr2.m_ptr;
        }
        friend bool operator != (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr != itr2.m_ptr;
        }
    private:
        pointer m_ptr;
    };

    joined_vector(_vec& vec1, _vec& vec2) :m_vec1(vec1), m_vec2(vec2)
    {

    }
    Iterator begin()
    {
        //checkes if m_vec1 is empty and gets the first elemet's address,
        //if it's empty then it get's the first address of the second vector m_vec2
        //if both of them are empty then nullptr is returned as the first pointer
        Iterator itr_beg((m_vec1.size() != 0) ? &m_vec1[0] : ((m_vec2.size() != 0) ? &m_vec2[0] : nullptr));
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    Iterator end()
    {
        //check if m_vec2 is empty and get the last address of that vector
        //if the second vector is empty then the m_vec1's vector/the first vector's last element's address is taken
        //if both of them are empty then a null pointer is returned as the end pointer
        typename _vec::value_type* p = ((m_vec2.size() != 0) ? &m_vec2[m_vec2.size() - 1] : ((m_vec1.size()) != 0 ? &m_vec1[m_vec1.size() - 1] : nullptr));
        Iterator itr_beg(p != nullptr ? p + 1 : nullptr);
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    typename _vec::value_type& operator [](int i)
    {
        if (i < m_vec1.size())
            return m_vec1[i];
        else
            return m_vec2[i - m_vec1.size()];
    }
    size_t size()
    {
        return m_vec1.size() + m_vec2.size();
    }

};

1
1201ProgramAlarm

If your vectors are sorted*, check out set_union from <algorithm>.

set_union(A.begin(), A.end(), B.begin(), B.end(), AB.begin());

There's a more thorough example in the link.


also, it does not do the same thing as a straight append - the elements in the output range are unique, which may not be what the OP wanted (they might not even be comparable). It's certainly not the most efficient way of doing it.
F
Faker

template<typename T>
vector<T> CombineVectors(vector<T> &&a, vector<T> &&b) {
  vector<T> ab = move(a);
  ab.insert(
      ab.end(), make_move_iterator(b.begin()), make_move_iterator(b.end()));
  return ab;
}


template<typename T>
vector<T> CombineVectors(const vector<T> &a, const vector<T> &b) {
  vector<T> ab = a;
  ab.insert(ab.end(), b.begin(), b.end());
  return ab;
}