
How do I create an empty array and then append to it in NumPy?

I want to create an empty array and append items to it, one at a time.

xs = []
for item in data:
    xs.append(item)

Can I use this list-style notation with NumPy arrays?


Mateen Ulhaq

That is the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. To append rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly.

Instead of appending rows, allocate a suitably sized array, and then assign to it row-by-row:

>>> import numpy as np

>>> a = np.zeros(shape=(3, 2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

>>> a[0] = [1, 2]
>>> a[1] = [3, 4]
>>> a[2] = [5, 6]

>>> a
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])
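
As a small illustration of the copying described above, here is a minimal sketch (the names a and b are only illustrative) showing that np.append hands back a newly allocated array instead of growing the original in place:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)              # returns a new array; `a` is not modified

print(a)                         # the original still has three elements
print(b)                         # the copy has four
print(np.shares_memory(a, b))    # False: the two arrays use separate buffers
```

Doing this inside a loop repeats the full copy on every iteration, which is why preallocating tends to be the cheaper option.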

There is also numpy.empty() if you don't need to zero the array.
What's the benefit of using empty() over zeros()?
If you're going to initialize it with your data straight away, you save the cost of zeroing it.
@maracorossi So .empty() means one can find arbitrary values in the cells, but the array is created quicker than, e.g., with .zeros()?
@user3085931 Yep!
Peter Mortensen

A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient: every time you call it, all the data in the existing array is copied into a new one. (The append function has the same issue.) If you want to build up your matrix one column at a time, you are probably best off keeping it in a list until it is finished, and only then converting it into an array.

e.g.


mylist = []
for item in data:
    mylist.append(item)
mat = numpy.array(mylist)

item can be a list, an array or any iterable, as long as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use


mat = numpy.array(data)

(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)
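
A minimal sketch of the list-then-convert idea for the column case, assuming hypothetical input data (the names columns and collected are made up for illustration). Note that np.array(collected) would treat each item as a row, so this sketch uses np.column_stack to place each item as a column instead:

```python
import numpy as np

# Hypothetical input: three columns, each with two elements.
columns = [[1, 2], [3, 4], [5, 6]]

collected = []
for col in columns:
    collected.append(col)           # cheap list append, no array copy

mat = np.column_stack(collected)    # a single conversion at the end
print(mat.shape)                    # (2, 3)
print(mat)
```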

EDIT:

If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!


Are numpy arrays/matrices fundamentally different from Matlab ones?
If for some reason you need to define an empty array with a fixed width (e.g. as a starting point for np.concatenate()), you can use np.empty((0, some_width)). The first dimension is 0, so the initial array contains no uninitialized garbage values.
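
A short sketch of what that comment describes, assuming a width of 3 and a made-up block of data (acc and block are illustrative names):

```python
import numpy as np

some_width = 3                        # assumed column count
acc = np.empty((0, some_width))       # zero rows, so no uninitialized values exist
print(acc.shape)                      # (0, 3)

block = np.ones((2, some_width))      # hypothetical data to add
acc = np.concatenate((acc, block))    # default axis=0 stacks rows
print(acc.shape)                      # (2, 3)
```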
Franck Dernoncourt

To create an empty multidimensional array in NumPy (e.g. a 2D array m*n to store your matrix), in case you don't know how many rows m you will append and don't care about the computational cost Stephen Simmons mentioned (namely rebuilding the array at each append), you can set to 0 the size of the dimension along which you want to append: X = np.empty(shape=[0, n]).

This way you can use for example (here m = 5 which we assume we didn't know when creating the empty matrix, and n = 2):

import numpy as np

n = 2
X = np.empty(shape=[0, n])

for i in range(5):
    for j in range(2):
        X = np.append(X, [[i, j]], axis=0)

print(X)

which will give you:

[[ 0.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 1.  1.]
 [ 2.  0.]
 [ 2.  1.]
 [ 3.  0.]
 [ 3.  1.]
 [ 4.  0.]
 [ 4.  1.]]

This should be the answer to the question the OP asked, for the use case where you don't know the number of rows in advance, or want to handle the case where there are 0 rows.
While this does work as the OP asked, it is not a good answer. If you know the iteration range you know the target array size.
But there are of course plenty of examples where you don't know the iteration range and you don't care about the computational cost. Good answer in that case!
gsamaras

I looked into this a lot because I needed to use a numpy.array as a set in one of my school projects and I needed it to be initialized empty. I didn't find any relevant answer here on Stack Overflow, so I started experimenting.

# Initialize your variable as an empty list first
In [32]: x=[]
# and now cast it as a numpy ndarray
In [33]: x=np.array(x)

The result will be:

In [34]: x
Out[34]: array([], dtype=float64)

Therefore you can directly initialize an np array as follows:

In [36]: x= np.array([], dtype=np.float64)

I hope this helps.


This does not work for arrays, as in the question, but it can be useful for vectors.
a=np.array([]) seems to default to float64
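
To make the dtype point concrete, a small sketch (x and y are illustrative names): an empty array built without a dtype defaults to float64, so integers appended to it come out as floats unless the dtype is given up front.

```python
import numpy as np

x = np.array([])
print(x.dtype)                     # float64
print(np.append(x, 7).dtype)       # still float64, the 7 is stored as 7.0

y = np.array([], dtype=np.int64)   # explicit integer dtype
y = np.append(y, 7)
print(y.dtype, y)
```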
pradyunsg

You can use the append function. For rows:

>>> from numpy import *
>>> a = array([[10, 20, 30]])
>>> a = append(a, [[1, 2, 3]], axis=0)
>>> a
array([[10, 20, 30],
       [ 1,  2,  3]])

For columns:

>>> append(a, [[15], [15]], axis=1)
array([[10, 20, 30, 15],
       [ 1,  2,  3, 15]])

EDIT: Of course, as mentioned in other answers, unless you're doing some processing (e.g. inversion) on the matrix/array EVERY time you append something to it, I would just create a list, append to it, and then convert it to an array.


How does this answer the question? I don't see the part about empty arrays
Pedram

For creating an empty NumPy array without defining its shape you can do the following:

arr = np.array([])

The first one is preferred because you know you will be using this as a NumPy array. NumPy converts this to np.ndarray type afterward, without extra [] 'dimension'.

To add a new element to the array you can do:

arr = np.append(arr, 'new element')

Note that under the hood there is no such thing as an array without a defined shape. As @hpaulj mentioned, this also makes a rank-one array.


No, np.array([]) creates an array with shape (0,), a 1d array with 0 elements. There's no such thing as an array without defined shape. And 2) does the same thing as 1).
That's true, @hpaulj, although the whole point of the discussion is to not have to think about the shape when you're creating one. Worth mentioning anyway.
Darius

Here is a workaround to make NumPy arrays behave more like lists:

np_arr = np.array([])
np_arr = np.append(np_arr , 2)
np_arr = np.append(np_arr , 24)
print(np_arr)

OUTPUT: [ 2. 24.]


Stay away from np.append. It's not a list append clone, despite the poorly chosen name.
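
A brief sketch of two ways np.append differs from list.append (the names m, flat and rows are illustrative): without an axis argument it flattens both inputs, and it never changes the array it is given.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

flat = np.append(m, [[5, 6]])            # no axis: both inputs are flattened
rows = np.append(m, [[5, 6]], axis=0)    # axis=0: a new row is added

print(flat.shape)   # (6,)
print(rows.shape)   # (3, 2)
print(m.shape)      # (2, 2), m itself is unchanged
```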
cyborg

If you absolutely don't know the final size of the array, you can increment the size of the array like this:

my_arr = numpy.zeros((0,5))
for i in range(3):
    my_arr=numpy.concatenate( ( my_arr, numpy.ones((1,5)) ) )
print(my_arr)

[[ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]]

Notice the 0 in the first line.

numpy.append is another option. It calls numpy.concatenate.


Ali G

You can apply it to build any kind of array, like zeros:

a = range(5)
a = [i*0 for i in a]
print(a)
[0, 0, 0, 0, 0]

If you want to do that in pure python, a= [0] * 5 is the simple solution
Brent Bradburn

Depending on what you are using this for, you may need to specify the data type (see 'dtype').

For example, to create a 2D array of 8-bit values (suitable for use as a monochrome image):

myarray = numpy.empty(shape=(H,W),dtype='u1')

For an RGB image, include the number of color channels in the shape: shape=(H,W,3)

You may also want to consider zero-initializing with numpy.zeros instead of using numpy.empty. See the note here.
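
A minimal sketch of the image-shaped arrays described above, assuming made-up dimensions H and W and using numpy.zeros so every pixel starts at a known value:

```python
import numpy as np

H, W = 4, 6                                   # hypothetical image size
gray = np.zeros(shape=(H, W), dtype='u1')     # 'u1' is shorthand for uint8
rgb = np.zeros(shape=(H, W, 3), dtype='u1')   # three color channels per pixel

print(gray.dtype, gray.nbytes)   # uint8 24
print(rgb.shape, rgb.nbytes)     # (4, 6, 3) 72
```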


SteveTz

Another simple way to create an empty array whose cells can each hold an array is:

import numpy as np
np.empty((2,3), dtype=object)
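
A small sketch of how such an object array might be used (cells is an illustrative name). With dtype=object every cell starts out as None, and each cell can then be assigned an array of any shape:

```python
import numpy as np

cells = np.empty((2, 3), dtype=object)   # every cell starts as None
cells[0, 0] = np.arange(3)               # a cell can hold a 1D array
cells[1, 2] = np.zeros((2, 2))           # or an array of a different shape

print(cells[0, 0])
print(cells[0, 1] is None)               # untouched cells are still None
```

Keep in mind that an object array stores references to Python objects, so it does not give the contiguous numeric storage that makes ordinary NumPy arrays fast.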

runo

I think you want to handle most of the work with lists and then use the result as a matrix. Maybe this is one way:

ur_list = []
for col in columns:
    ur_list.append(list(col))

mat = np.matrix(ur_list)

veeresh d

I think you can create an empty NumPy array like this:

>>> import numpy as np
>>> empty_array= np.zeros(0)
>>> empty_array
array([], dtype=float64)
>>> empty_array.shape
(0,)

This form is useful when you want to append to a NumPy array in a loop.


Edgar Duarte

Perhaps what you are looking for is something like this:

x=np.array(0)

In this way you can create an array without any elements. It is similar to:

x=[]

This way you will be able to append new elements to your array later.


No, your x is an array with shape (), and one element. It is more like 0 than []. You could call it a 'scalar array'.
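
The difference the comment points out can be checked directly. A quick sketch (scalar and empty are illustrative names):

```python
import numpy as np

scalar = np.array(0)    # zero-dimensional array holding one element
empty = np.array([])    # one-dimensional array holding no elements

print(scalar.shape, scalar.ndim, scalar.size)   # () 0 1
print(empty.shape, empty.ndim, empty.size)      # (0,) 1 0
```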
user3810512

The simplest way

Input:

import numpy as np
data = np.zeros((0, 0), dtype=float)   # (rows,cols)
data.shape

Output: (0, 0)

Input:

for i in range(n_files):
    data = np.append(data, new_data, axis=0)
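
One caveat worth checking before relying on this: with axis=0, np.append requires the column counts to match, so a (0, 0) start only works if new_data also has zero columns. A hedged sketch, assuming a made-up new_data with 3 columns and a two-iteration loop standing in for the files:

```python
import numpy as np

new_data = np.ones((2, 3))        # hypothetical chunk with 3 columns

try:
    np.append(np.zeros((0, 0)), new_data, axis=0)
    mismatch_failed = False
except ValueError:
    mismatch_failed = True        # column counts 0 and 3 do not match

# Starting with zero rows but the right number of columns works.
data = np.zeros((0, new_data.shape[1]), dtype=float)
for _ in range(2):                # stand-in for looping over files
    data = np.append(data, new_data, axis=0)
print(data.shape)                 # (4, 3)
```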