
How do you copy the contents of an array to a std::vector in C++ without looping?

I have an array of values that is passed to my function from a different part of the program that I need to store for later processing. Since I don't know how many times my function will be called before it is time to process the data, I need a dynamic storage structure, so I chose a std::vector. I don't want to have to do the standard loop to push_back all the values individually, it would be nice if I could just copy it all using something similar to memcpy.


MattyT

There have been many answers here and just about all of them will get the job done.

However there is some misleading advice!

Here are the options:

vector<int> dataVec;

int dataArray[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
unsigned dataArraySize = sizeof(dataArray) / sizeof(int);

// Method 1: Copy the array to the vector using back_inserter.
{
    copy(&dataArray[0], &dataArray[dataArraySize], back_inserter(dataVec));
}

// Method 2: Same as 1 but pre-extend the vector by the size of the array using reserve
{
    dataVec.reserve(dataVec.size() + dataArraySize);
    copy(&dataArray[0], &dataArray[dataArraySize], back_inserter(dataVec));
}

// Method 3: Memcpy
{
    dataVec.resize(dataVec.size() + dataArraySize);
    memcpy(&dataVec[dataVec.size() - dataArraySize], &dataArray[0], dataArraySize * sizeof(int));
}

// Method 4: vector::insert
{
    dataVec.insert(dataVec.end(), &dataArray[0], &dataArray[dataArraySize]);
}

// Method 5: vector + vector
{
    vector<int> dataVec2(&dataArray[0], &dataArray[dataArraySize]);
    dataVec.insert(dataVec.end(), dataVec2.begin(), dataVec2.end());
}

To cut a long story short, Method 4, using vector::insert, is the best for bsruth's scenario.

Here are some gory details:

Method 1 is probably the easiest to understand. Just copy each element from the array and push it into the back of the vector. Alas, it's slow. Because there's a loop (implied with the copy function), each element must be treated individually; no performance improvements can be made based on the fact that we know the array and vectors are contiguous blocks.

Method 2 is a suggested performance improvement to Method 1; just pre-reserve the size of the array before adding it. For large arrays this might help. However the best advice here is never to use reserve unless profiling suggests you may be able to get an improvement (or you need to ensure your iterators are not going to be invalidated). Bjarne agrees. Incidentally, I found that this method performed the slowest most of the time though I'm struggling to comprehensively explain why it was regularly significantly slower than method 1...

Method 3 is the old school solution - throw some C at the problem! Works fine and fast for POD types. In this case resize is required to be called since memcpy works outside the bounds of vector and there is no way to tell a vector that its size has changed. Apart from being an ugly solution (byte copying!) remember that this can only be used for POD types. I would never use this solution.
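
If you do go the memcpy route, one way to guard against the POD restriction is to have the compiler check it. This is only a sketch, assuming C++11 or later; append_bytes is a hypothetical helper name, not something from the answers here.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: appends `count` elements from `src` to `dest`
// by resizing first and then byte-copying into the new tail.
template <typename T>
void append_bytes(std::vector<T>& dest, const T* src, std::size_t count)
{
    // Refuse to compile for types where a byte copy would be invalid.
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");

    if (count == 0) {
        return;  // nothing to copy; also avoids passing a null src to memcpy
    }

    const std::size_t old_size = dest.size();
    dest.resize(old_size + count);  // value-initializes the new elements
    std::memcpy(dest.data() + old_size, src, count * sizeof(T));
}
```

Calling append_bytes on a std::vector<std::string> should fail at compile time rather than corrupt memory at run time.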

Method 4 is the best way to go. Its meaning is clear, it's (usually) the fastest, and it works for any objects. There is no downside to using this method for this application.
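
For reference, Method 4 wrapped in a function might look like the sketch below. The name append_array is hypothetical and the template is just for illustration; the only call that matters is vector::insert with a pointer range.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends `count` elements starting at `src`
// to the end of `dest`.
template <typename T>
void append_array(std::vector<T>& dest, const T* src, std::size_t count)
{
    // Raw pointers are random-access iterators, so insert can work out
    // the number of elements up front; implementations typically
    // reallocate at most once for the whole range.
    dest.insert(dest.end(), src, src + count);
}
```

Because the function appends rather than replaces, it can be called any number of times before the data is processed.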

Method 5 is a tweak on Method 4 - copy the array into a vector and then append it. Good option - generally fast-ish and clear.

Finally, you are aware that you can use vectors in place of arrays, right? Even when a function expects c-style arrays you can use vectors:

vector<char> v(50); // Ensure there's enough space
strcpy(&v[0], "prefer vectors to c arrays");
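
A slightly safer variant of the same idea, sketched under the assumption that C++11 is available: vector::data() gives the raw pointer, and a bounded C function such as std::snprintf keeps the write inside the buffer. The function name format_value and the buffer size are made up for the example.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: lets a C-style API write into vector-owned storage.
std::string format_value(int value)
{
    std::vector<char> buffer(32);  // size chosen for illustration only

    // snprintf never writes more than buffer.size() bytes and always
    // null-terminates when the size is non-zero.
    std::snprintf(buffer.data(), buffer.size(), "value=%d", value);

    return std::string(buffer.data());
}
```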

Hope that helps someone out there!


You can't safely & portably refer to "&dataArray[dataArraySize]"--it's dereferencing a past-the-end pointer/iterator. Instead, you can say dataArray + dataArraySize to get the pointer without having to dereference it first.
@Drew: yes, you can, at least in C. It is defined that &expr doesn't evaluate expr, it only computes the address of it. And a pointer one past the last element is perfectly valid, too.
Have you tried doing method 4 with 2? i.e. reserving the space before inserting. It seems that if the data size is big, multiple insertions will need multiple reallocations. Because we know the size a priori, we can do the reallocation, before inserting.
@MattyT what is the point of method 5? Why make an intermediate copy of the data?
I personally would rather profit from arrays decaying to pointers automatically: dataVec.insert(dataVec.end(), dataArray, dataArray + dataArraySize); – appears much clearer to me. Cannot gain anything from method 5 either, only looks pretty inefficient – unless compiler is able to optimise the vector away again.
phoenix

If you can construct the vector after you've gotten the array and array size, you can just say:

std::vector<ValueType> vec(a, a + n);

...assuming a is your array and n is the number of elements it contains. Otherwise, std::copy() w/resize() will do the trick.

I'd stay away from memcpy() unless you can be sure that the values are plain-old data (POD) types.

Also, worth noting that none of these really avoids the for loop--it's just a question of whether you have to see it in your code or not. O(n) runtime performance is unavoidable for copying the values.

Finally, note that C-style arrays are perfectly valid containers for most STL algorithms--the raw pointer is equivalent to begin(), and (ptr + n) is equivalent to end().
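
To illustrate that last point, here is a small sketch (helper names are hypothetical) that feeds a raw pointer range and a built-in array straight into standard algorithms. std::begin and std::end assume C++11.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: a pointer and (pointer + n) act as begin() and end().
template <typename T>
std::ptrdiff_t count_in_array(const T* ptr, std::size_t n, const T& value)
{
    return std::count(ptr, ptr + n, value);
}

// Hypothetical helper: for a real array (not a decayed pointer),
// std::begin/std::end recover the bounds without a separate size.
template <typename T, std::size_t N>
std::vector<T> sorted_copy(const T (&arr)[N])
{
    std::vector<T> result(std::begin(arr), std::end(arr));
    std::sort(result.begin(), result.end());
    return result;
}
```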


The reason why looping and calling push_back is bad is because you might force the vector to resize multiple times if the array is long enough.
@bradtgmurray: I think any reasonable implementation of the "two iterators" vector constructor I suggested above would call std::distance() first on the two iterators to get the needed number of elements, then allocate just once.
@bradtgmurray: Even push_back() wouldn't be too bad because of the exponential growth of vectors (aka "amortized constant time"). I think runtime would only be on the order of 2x worse in the worst case.
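
A rough way to check the amortized-growth claim on your own implementation is to count how often the capacity changes during a run of push_back calls. This is only a sketch; count_reallocations is a hypothetical name, and the exact count depends on the library's growth factor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes n ints one at a time and returns how many
// times the vector's capacity changed along the way.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();

    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With geometric growth the result should be roughly logarithmic in n, which is far fewer reallocations than elements pushed.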
And if the vector is already there, a vec.clear(); vec.insert(vec.begin(), a, a + n); would work as well. Then you wouldn't even require a to be a pointer, just an iterator, and the vector assignment would be fairly general (and the C++/STL way).
Another alternative when unable to construct would be assign: vec.assign(a, a+n), which would be more compact than copy & resize.
Torlack

If all you are doing is replacing the existing data, then you can do this

std::vector<int> data; // evil global :)

void CopyData(int *newData, size_t count)
{
   data.assign(newData, newData + count);
}

Simple to understand and definitely the fastest solution (it's just a memcpy behind the scenes).
Is data.assign faster than data.insert?
luke

std::copy is what you're looking for.


bsruth

Since I can only edit my own answer, I'm going to make a composite answer from the other answers to my question. Thanks to all of you who answered.

Using std::copy, this still iterates in the background, but you don't have to type out the code.

int foo(int* data, int size)
{
   static std::vector<int> my_data; //normally a class variable
   std::copy(data, data + size, std::back_inserter(my_data));
   return 0;
}

Using regular memcpy. This is probably best used for basic data types (e.g. int) but not for more complex arrays of structs or classes.

vector<int> x(size);
memcpy(&x[0], source, size*sizeof(int));

I was going to recommend this approach.
It is most likely more efficient to resize your vector up front if you know the size ahead of time, and not use the back_inserter.
you could add my_data.reserve(size)
Note that internally this is doing exactly what you seem to want to avoid. It is not copying bits, it is just looping and calling push_back(). I guess you only wanted to avoid typing the code?
Why not use the vector constructor to copy the data?
Toby Speight
int dataArray[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };//source

unsigned dataArraySize = sizeof(dataArray) / sizeof(int);

std::vector<int> myvector (dataArraySize );//target

std::copy ( myints, myints+dataArraySize , myvector.begin() );

//myvector now has 1,2,3,...10 :-)

Whilst this code snippet is welcome, and may provide some help, it would be greatly improved if it included an explanation of how and why this solves the problem. Remember that you are answering the question for readers in the future, not just the person asking now! Please edit your answer to add explanation, and give an indication of what limitations and assumptions apply.
Wait, what's myints?
I guess this example is from cplusplus.com/reference/algorithm/copy, where you can find myints :)
Shane Powell

Yet another answer, since the person said "I don't know how many times my function will be called", you could use the vector insert method like so to append arrays of values to the end of the vector:

vector<int> x;

void AddValues(int* values, size_t size)
{
   x.insert(x.end(), values, values+size);
}

I like this way because the implementation of the vector should be able to optimize for the best way to insert the values based on the iterator type and the type itself. You are somewhat relying on the implementation of the STL.

If you need to guarantee the fastest speed and you know your type is a POD type then I would recommend the resize method in Thomas's answer:

vector<int> x;

void AddValues(int* values, size_t size)
{
   size_t old_size(x.size());
   x.resize(old_size + size, 0);
   memcpy(&x[old_size], values, size * sizeof(int));
}

Assaf Lavie

Avoid the memcpy, I say. No reason to mess with pointer operations unless you really have to. Also, it will only work for POD types (like int) but would fail if you're dealing with types that require construction.


Maybe this should be a comment on one of the other answers, as you do not actually propose a solution.
Thomas Jones-Low

In addition to the methods presented above, you need to make sure you use either std::vector::reserve(), std::vector::resize(), or construct the vector to size, to make sure your vector has enough elements in it to hold your data. If not, you will corrupt memory. This is true of either std::copy() or memcpy().

This is the reason to use vector.push_back(), you can't write past the end of the vector.


If you are using a back_inserter, you don't need to pre-reserve the size of the vector you're copying to. back_inserter does a push_back().
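
The difference is easy to mix up, so here is a sketch of both variants side by side (function names are hypothetical). Note that reserve() only changes capacity, not size, so writing through begin() needs resize() or a sized constructor, while reserve() is only useful together with push_back or back_inserter.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Copying into an existing range: the destination elements must already
// exist, so the vector is resized (not merely reserved) first.
std::vector<int> copy_with_resize(const int* src, std::size_t n)
{
    std::vector<int> out;
    out.resize(n);  // creates n elements that std::copy then overwrites
    std::copy(src, src + n, out.begin());
    return out;
}

// Copying through back_inserter: each assignment calls push_back, so the
// vector grows on its own and reserve() is purely an optimization.
std::vector<int> copy_with_back_inserter(const int* src, std::size_t n)
{
    std::vector<int> out;
    out.reserve(n);  // optional: raises capacity, size stays 0
    std::copy(src, src + n, std::back_inserter(out));
    return out;
}
```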
Thomas Jones-Low

Assuming you know how big the items in the vector are:

std::vector<int> myArray;
myArray.resize (item_count, 0);
memcpy (&myArray.front(), source, item_count * sizeof(int));

http://www.cppreference.com/wiki/stl/vector/start


Doesn't that depend on the implementation of std::vector?
That's horrible! You are filling the array twice, once with 0s, then with the proper values. Just do: std::vector<int> myArray(source, source + item_count); and trust your compiler to produce the memcpy!
Trust your compiler to produce __memcpy_int_aligned; that should be even faster