
Best way to extract a subvector from a vector?

Suppose I have a std::vector (let's call it myVec) of size N. What's the simplest way to construct a new vector consisting of a copy of elements X through Y, where 0 <= X <= Y <= N-1? For example, myVec [100000] through myVec [100999] in a vector of size 150000.

If this cannot be done efficiently with a vector, is there another STL datatype that I should use instead?

You say you want to extract a subvector, but it seems to me that what you really want is a view of (access to) the subvector; the difference is that a view would not copy. Old-school C++ would use a start pointer and an end pointer: since the memory of a std::vector is contiguous, you can iterate using pointers and thereby avoid the copy. If you do not mind the copy, then just initialize a new vector with the scope of your previous vector.
There is .data() (cplusplus.com/reference/vector/vector/data) since C++11. However, using pointers into STL containers is discouraged; see stackoverflow.com/questions/31663770/…
@serup This may not be of interest to the OP, but I would like to know how to "initialize a new vector with the scope of your previous vector".

Greg Rogers
vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
vector<T> newVec(first, last);

It's an O(N) operation to construct the new vector, but there isn't really a better way.
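For reuse, the same range constructor can be wrapped in a small function. This is only a sketch: the name slice and the index-based, half-open [first, last) interface are my own assumptions, not something from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of the elements v[first] .. v[last - 1].
template <typename T>
std::vector<T> slice(const std::vector<T>& v, std::size_t first, std::size_t last) {
    // The half-open range [first, last) must lie inside v.
    assert(first <= last && last <= v.size());
    const auto base = v.begin();
    // Range constructor: copies last - first elements, so the cost is O(last - first).
    return std::vector<T>(base + static_cast<std::ptrdiff_t>(first),
                          base + static_cast<std::ptrdiff_t>(last));
}
```

With the numbers from the question, the call would be slice(myVec, 100000, 101000).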


+1, also it's O(Y-X), which is less than or equal to O(N) (and in his example much less)
@orip Well, then it's O(N) after all.
@GregRogers: It doesn't make sense to use the big-O notation where N is a specific number. Big-O communicates the rate of growth with respect to how N changes. Johann: It's best not to use one variable name in two ways. We'd normally say either O(Y-X), or we'd say O(Z) where Z=Y-X.
@GregRogers By using this way, we need to declare a new vector. Is there a way to change the original vector? something like myVec(first, last)? I know this is wrong, but I really need the solution as I want to use recursion in my codes, and need to repeatedly use the same vector (although changed). Thanks!
Why not just vector<T> newVec(myVec.begin() + 100000, myVec.begin() + 101000);?
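On the question of changing the original vector instead of declaring a new one: one possible approach is to erase everything outside the wanted range. A minimal sketch, assuming a hypothetical helper name keep_range and a half-open [first, last) index range:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: shrinks v in place so that only v[first] .. v[last - 1] remain.
template <typename T>
void keep_range(std::vector<T>& v, std::size_t first, std::size_t last) {
    assert(first <= last && last <= v.size());
    // Drop the tail first: index `first` keeps its meaning and fewer
    // elements have to be shifted by the second erase.
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(last), v.end());
    v.erase(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(first));
}
```

Note that the second erase shifts the kept elements to the front, so this is still linear in the number of elements kept, and any iterators or pointers into v obtained earlier should be treated as invalid afterwards.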
Martin York

Just use the vector constructor.

std::vector<int>   data;   // note: "data()" would declare a function, not a vector
// Load Z elements into data so that Z > Y > X

std::vector<int>   sub(&data[100000],&data[101000]);

Ok, I didn't realize it was that simple to obtain an iterator from an arbitrary vector element.
Taking the address of those vector elements is an unportable hack that will break if the vector storage is not in fact contiguous. Use begin() + 100000 etc.
My bad, apparently the standard guarantees that vector storage is contiguous. Nevertheless it's bad practice to work with addresses like this as it is certainly not guaranteed to work for all containers supporting random access, while begin() + 100000 is.
@j_random_hacker: Sorry, I have to disagree. The STL specification for std::vector was explicitly changed to support this type of procedure. Also, a pointer is a valid type of iterator. Look up iterator_traits<>
@taktak004 Nope. Remember that operator[] returns a reference. It is only at the point where you read or write the reference that it would become an access violation. Since we do neither but instead get the address, we have not invoked UB.
Dávid Tóth

This discussion is pretty old, but the simplest one isn't mentioned yet, with list-initialization:

 vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2}; 

It requires c++11 or above.

Example usage:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(){

    vector<int> big_vector = {5,12,4,6,7,8,9,9,31,1,1,5,76,78,8};
    vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2};

    cout << "Big vector: ";
    for_each(big_vector.begin(), big_vector.end(),[](int number){cout << number << ";";});
    cout << endl << "Subvector: ";
    for_each(subvector.begin(), subvector.end(),[](int number){cout << number << ";";});
    cout << endl;
}

Result:

Big vector: 5;12;4;6;7;8;9;9;31;1;1;5;76;78;8;
Subvector: 6;7;8;9;9;31;1;1;5;76;

It defines the subrange: it starts at index 3 (the fourth element) and stops before the last two elements.
jackw11111

std::vector<T>(input_iterator, input_iterator), in your case foo = std::vector<T>(myVec.begin () + 100000, myVec.begin () + 150000);, see for example here


Since Andrew is trying to construct a new vector, I would recommend "std::vector foo(..." instead of copying with "foo = std::vector(..."
Yeah, of course, but whether you type std::vector foo = std::vector(...) or std::vector foo (...) should not matter.
einpoklum

These days, we use spans! So you would write:

#include <gsl/span>

...
auto start_pos = 100000;
auto length = 1000;
auto span_of_myvec = gsl::make_span(myvec);
auto my_subspan = span_of_myvec.subspan(start_pos, length);

to get a span of 1000 elements of the same type as myvec's. Or a more terse form:

auto my_subspan = gsl::make_span(myvec).subspan(100000, 1000);

(but I don't like this as much, since the meaning of each numeric argument is not entirely clear; and it gets worse if the length and start_pos are of the same order of magnitude.)

Anyway, remember that this is not a copy, it's just a view of the data in the vector, so be careful. If you want an actual copy, you could do:

std::vector<T> new_vec(my_subspan.cbegin(), my_subspan.cend());

Notes:

gsl stands for Guidelines Support Library. For more information about gsl, see: http://www.modernescpp.com/index.php/c-core-guideline-the-guidelines-support-library.

There are several gsl implementations. For example: https://github.com/martinmoene/gsl-lite

C++20 provides an implementation of span. You would use std::span and #include <span> rather than #include <gsl/span>.

For more information about spans, see: What is a "span" and when should I use one?

std::vector has a gazillion constructors, it's super-easy to fall into one you didn't intend to use, so be careful.


I would use cbegin and cend just on principle ;) or even std::cbegin etc.
@JHBonarius: Seeing how this code isn't templated on the choice of container, I don't see there's a particular benefit; a matter of taste I suppose.
Eclipse

If both are not going to be modified (no adding/deleting items - modifying existing ones is fine as long as you pay heed to threading issues), you can simply pass around data.begin() + 100000 and data.begin() + 101000, and pretend that they are the begin() and end() of a smaller vector.

Or, since vector storage is guaranteed to be contiguous, you can simply pass around a 1000 item array:

T *arrayOfT = &data[0] + 100000;
size_t arrayOfTLength = 1000;

Both these techniques take constant time, but require that the length of data doesn't increase, triggering a reallocation.
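To make the two styles concrete, here is a small sketch of consumers that accept either an iterator pair or a pointer plus length. The function names sum_range and sum_array are made up for illustration, and int elements are assumed for the pointer version.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical consumer: works on any iterator range and never copies the elements.
template <typename Iter>
long long sum_range(Iter first, Iter last) {
    return std::accumulate(first, last, 0LL);
}

// Hypothetical consumer in the pointer-plus-length style.
inline long long sum_array(const int* items, std::size_t length) {
    assert(items != nullptr || length == 0);
    return std::accumulate(items, items + length, 0LL);
}
```

A call would look like sum_range(data.begin() + 100000, data.begin() + 101000) or sum_array(data.data() + 100000, 1000); both are only safe as long as data is not resized in the meantime.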


This is also good if you want the original vector and the subvector to be linked.
MasterHD

You didn't mention what type std::vector<...> myVec is, but if it's a simple type or struct/class that doesn't include pointers, and you want the best efficiency, then you can do a direct memory copy (which I think will be faster than the other answers provided). Here is a general example for std::vector<type> myVec where type in this case is int:

typedef int type; //choose your custom type/struct/class
int iFirst = 100000; //first index to copy
int iLast = 101000; //last index + 1
int iLen = iLast - iFirst;
std::vector<type> newVec;
newVec.resize(iLen); //pre-allocate the space needed to write the data directly
memcpy(&newVec[0], &myVec[iFirst], iLen*sizeof(type)); //write directly to destination buffer from source buffer

I wonder whether, with -O3, @Anteru's "using constructor" version, std::vector(myVec.begin () + 100000, myVec.begin () + 150000);, would produce exactly the same assembly as this longer version?
MSVC++ 2015, for example, compiles std::vector<>(iter, iter) to memmove(), if appropriate (if constructor is trivial, for a suitable definition of trivial).
Don't call memcpy. Do a std::copy or a constructor which accepts a range (two iterators), and the compiler and the std.library will conspire to call memcpy when appropriate.
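A sketch of the std::copy variant suggested in the comments, keeping the "pre-size, then write" shape of the memcpy answer. The helper name copy_block is hypothetical; whether the library actually lowers this to a block copy depends on the implementation and the element type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies src[first] .. src[last - 1] into a pre-sized vector.
// Unlike memcpy, std::copy is correct for any copy-assignable T; for trivially
// copyable types the standard library is free to use a memmove-style block copy.
template <typename T>
std::vector<T> copy_block(const std::vector<T>& src, std::size_t first, std::size_t last) {
    assert(first <= last && last <= src.size());
    std::vector<T> out(last - first);  // value-initialized; T must be default-constructible
    std::copy(src.begin() + static_cast<std::ptrdiff_t>(first),
              src.begin() + static_cast<std::ptrdiff_t>(last),
              out.begin());
    return out;
}
```

The same function works for element types that own memory (for example a vector of vectors), where a raw memcpy would be undefined behaviour.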
MasterAler

You could just use insert

vector<type> myVec(n_elements); // parentheses: n_elements is a size; braces could create a single element instead

vector<type> newVec;

newVec.insert(newVec.begin(), myVec.begin() + X, myVec.begin() + Y);

Yuval F

You can use STL copy with O(M) performance, where M is the size of the subvector.


Upvoted because it pointed me in the right direction but I can see why @LokiAstari suggests it's not the correct choice - since the STL::copy works with two std::vector arrays of the same size and type. Here, the OP wants to copy a subsection into a new, smaller array as outlined here in the OP's post: "0 <= X <= Y <= N-1"
@Andrew, see example using std::copy and std::back_inserter
@LokiAstari why not?
@LokiAstari I was referring to an edit to this that did not survive peer review, which posed the example
vector<T> newvec; std::copy(myvec.begin()+10000, myvec.begin() +10100, std::back_inserter(newvec));
in this case, you don't need to build the destination first, but sure, direct initialization is more... direct.
@chrisg: It's also two lines. Additionally, you need to stick a third line in to make sure it is efficient: newvec.reserve(10100 - 10000);. It's definitely an option and technically it will work. But out of the two, which are you going to recommend?
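Putting the three lines from this comment thread together (reserve, then std::copy with std::back_inserter), a rough sketch could look like this. The function name append_range is an assumption of mine, and dest and src are assumed to be different vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: appends src[first] .. src[last - 1] to the end of dest.
template <typename T>
void append_range(std::vector<T>& dest, const std::vector<T>& src,
                  std::size_t first, std::size_t last) {
    assert(&dest != &src);  // assumption: source and destination are distinct
    assert(first <= last && last <= src.size());
    // Reserve up front so back_inserter does not trigger repeated reallocations.
    dest.reserve(dest.size() + (last - first));
    std::copy(src.begin() + static_cast<std::ptrdiff_t>(first),
              src.begin() + static_cast<std::ptrdiff_t>(last),
              std::back_inserter(dest));
}
```

This is mostly useful when the destination already exists or already holds elements; for a brand-new vector the range constructor remains the shorter option.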
Daniel Spiewak

The only way to project a collection that is not linear time is to do so lazily, where the resulting "vector" is actually a subtype which delegates to the original collection. For example, Scala's List#subseq method creates a sub-sequence in constant time. However, this only works if the collection is immutable and if the underlying language sports garbage collection.
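In C++ terms, such a delegating "subvector" could be sketched roughly as below. The class name SubVectorView is hypothetical. Since there is no garbage collector, this sketch simply assumes the original vector outlives the view and is not shrunk below the viewed range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lazy "subvector": stores a pointer to the original vector plus
// offsets, so construction is O(1) and no elements are copied.
template <typename T>
class SubVectorView {
public:
    SubVectorView(const std::vector<T>& source, std::size_t first, std::size_t last)
        : source_(&source), first_(first), size_(last - first) {
        assert(first <= last && last <= source.size());
    }

    std::size_t size() const { return size_; }

    const T& operator[](std::size_t i) const {
        assert(i < size_ && first_ + i < source_->size());
        return (*source_)[first_ + i];  // delegate every access to the original
    }

private:
    const std::vector<T>* source_;  // not owned; must outlive the view
    std::size_t first_;
    std::size_t size_;
};
```

Because it stores indices rather than element pointers, this particular view keeps working if the original vector reallocates, at the price of one extra indirection per access.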


The C++ way to do that would be to have a vector of shared_ptr to X instead of a vector of X and then copy the shared_ptrs, but unfortunately I don't think that is faster, because copying a shared_ptr involves an atomic operation. Alternatively, the original vector could be held by a const shared_ptr, and you just take a reference to a subrange in it. Of course you don't need to make it a shared_ptr to the vector, but then you have lifetime problems... All this is off the top of my head and could be wrong...
Meshu Deb Nath

Suppose there are two vectors.

 vector<int> vect1{1, 2, 3, 4};
 vector<int> vect2;

Method 1. Using the copy function. copy(first_iterator_index, last_iterator_index, back_inserter()): this function takes 3 arguments. The first is an iterator to the start of the old vector, the second is an iterator to the end of the old vector, and the third is a back_inserter on the new vector, which appends the values at its back.

    // Copying vector by copy function
    copy(vect1.begin(), vect1.end(), back_inserter(vect2));

Method 2. Using the assign function. assign(first_iterator_o, last_iterator_o): this method assigns the same values to the new vector as in the old one. It takes 2 arguments: an iterator to the start of the old vector and an iterator to its end.

    //Copying vector by assign function
    vect2.assign(vect1.begin(), vect1.end());

Community

Maybe the array_view/span in the GSL library is a good option.

Here is also a single file implementation: array_view.


Kindly add the answer here along with the link, as the external link might change in the future.
Jishu Dohare

You can copy elements from one vector to another easily. In this example, I am using a vector of pairs to make it easy to understand:

vector<pair<int, int> > v(n);

//we want the first half of the elements in vector a and the other half in vector b
vector<pair<int, int> > a(v.begin(),v.begin()+n/2);
vector<pair<int, int> > b(v.begin()+n/2, v.end());


//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
//then a = [(1, 2), (2, 3)]
//and b = [(3, 4), (4, 5), (5, 6)]

//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
//then a = [(1, 2), (2, 3), (3, 4)]
//and b = [(4, 5), (5, 6), (6, 7)]

As you can see, you can easily copy elements from one vector to another. If you want to copy the elements from index 10 up to (but not including) index 16, for example, you would use

vector<pair<int, int> > a(v.begin()+10, v.begin()+16);

and if you want elements from index 10 to some index from end, then in that case

vector<pair<int, int> > a(v.begin()+10, v.end()-5);

Hope this helps. Just remember that in the last case v.end()-5 must not come before v.begin()+10.
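The halving example can also be written as a small function. This is a sketch under my own naming (split_half is hypothetical); like the example above, the first half gets n/2 elements and the second half gets the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: copies the first n/2 elements into one vector and the
// remaining elements into another.
template <typename T>
std::pair<std::vector<T>, std::vector<T>> split_half(const std::vector<T>& v) {
    const auto mid = v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2);
    std::pair<std::vector<T>, std::vector<T>> halves{
        std::vector<T>(v.begin(), mid), std::vector<T>(mid, v.end())};
    // Every element of v ends up in exactly one of the two halves.
    assert(halves.first.size() + halves.second.size() == v.size());
    return halves;
}
```

For a vector with an odd number of elements, the extra element lands in the second half.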


JHBonarius

Yet another option: Useful for instance when moving between a thrust::device_vector and a thrust::host_vector, where you cannot use the constructor.

std::vector<T> newVector;
newVector.reserve(1000);
std::copy_n(&vec[100000], 1000, std::back_inserter(newVector));

This should also have O(N) complexity.

You could combine this with the top answer's code:

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
std::copy(first, last, std::back_inserter(newVector));

mrrgu

Posting this late just for others. I bet the first coder is done by now. For simple datatypes no copy is needed; just revert to good old C code methods.

std::vector <int>   myVec;
int *p;
// Add some data here and set start, then
p=myVec.data()+start;

Then pass the pointer p and a len to anything needing a subvector.

Note: len must satisfy len <= myVec.size()-start!


This doesn't perform a copy.