How to flatten only some dimensions of a numpy array

python numpy flatten

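Is there a quick way to "sub-flatten", that is, flatten only some of the leading dimensions of a numpy array?

For example, given a numpy array of shape (50,100,25), the resulting shape would be (5000,25).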

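This may help: stackoverflow.com/questions/13990465/3d-numpy-array-to-2d

You need a refresher on numpy ndarray slicing, also known as multidimensional array indexing; see docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html. Slice your ndarray with square brackets, using commas to separate what you want from each dimension. It looks something like this (not exactly): your_array[50:100, 7, :], which flattens the 3d object to 2d, using only slice number 7 of the 2nd dimension.

Slicing only takes a subset; the poster wants to keep all the data points. I assume you meant array[0:50,7,:], which gives size (50,25) and throws away 99% of the data.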

Alexander

Take a look at numpy.reshape.

>>> import numpy
>>> arr = numpy.zeros((50,100,25))
>>> arr.shape
# (50, 100, 25)

>>> new_arr = arr.reshape(5000,25)
>>> new_arr.shape   
# (5000, 25)

# One shape dimension can be -1. 
# In this case, the value is inferred from 
# the length of the array and remaining dimensions.
>>> another_arr = arr.reshape(-1, arr.shape[-1])
>>> another_arr.shape
# (5000, 25)
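One thing worth knowing about this approach: for a freshly allocated, C-contiguous array like the one above, reshape returns a view onto the same memory rather than a copy. The following is a minimal sketch to illustrate that; the variable name `flat` is just for this example.

```python
import numpy as np

arr = np.zeros((50, 100, 25))
flat = arr.reshape(-1, arr.shape[-1])

# For a C-contiguous array, reshape returns a view, not a copy.
print(np.shares_memory(arr, flat))  # True

# Writing through the view therefore changes the original array too.
flat[0, 0] = 1.0
print(arr[0, 0, 0])  # 1.0
```

If you need an independent array, call `.copy()` on the result.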

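Solutions like this seem a bit inelegant to me, because they require some redundant information. I wish there were a way to do this that only needs the subset of dimensions to be specified, something like arr.flatten(dimensions=(0, 1)).

@Denziloe you cannot simply "flatten" an arbitrary dimension of an ndarray without specifying which dimension the extra data will be folded into. Take a 2x2x3 ndarray as an example: flattening the last dimension can produce either 2x6 or 6x2, so the information is not redundant. You can specify the dimension with -1. From the numpy.reshape documentation: "One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions." So reshaping 2x2xN to 2Nx2 looks like this: arr.reshape((-1,2)).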
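To make the 2x2x3 example in the comment above concrete, here is a small sketch. The names `a` and `b` are only for illustration; both results hold the same 12 elements in the same order, just grouped differently.

```python
import numpy as np

arr = np.arange(12).reshape(2, 2, 3)

# Merge the last two axes into one: (2, 2, 3) -> (2, 6)
a = arr.reshape(arr.shape[0], -1)

# Regroup the same 12 elements into rows of length 2: (2, 2, 3) -> (6, 2)
b = arr.reshape(-1, 2)

print(a.shape)  # (2, 6)
print(b.shape)  # (6, 2)
```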

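@Denziloe one way to achieve this might be something like arr.reshape(arr.shape[0] * arr.shape[1], arr.shape[2])

@אלימלךשרייבר Interestingly, torch seems to have solved this somehow: pytorch.org/docs/stable/generated/torch.flatten.html ;)

@SebastianHoffmann, numpy's flatten can do that too. As the function name implies, flatten flattens a tensor/ndarray into a 1-D array, so there is no ambiguity to resolve. The issue discussed here, however, is flattening individual dimensions, e.g. going from a 6-D to a 5-D tensor/ndarray. Torch reshape requires the same specification in this respect.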
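The arr.flatten(dimensions=(0, 1)) interface asked for in the comments does not exist in NumPy, but a small wrapper around reshape gets close. The sketch below is only an illustration: `flatten_dims` and its `start`/`end` parameters are hypothetical names, not part of NumPy, and it merges a contiguous range of axes (inclusive on both ends).

```python
import math
import numpy as np

def flatten_dims(arr, start=0, end=-1):
    """Merge axes start..end (inclusive) into a single axis.

    Hypothetical helper for illustration; not a NumPy function.
    """
    arr = np.asarray(arr)
    ndim = arr.ndim
    # Allow negative axis numbers, as NumPy indexing does.
    if start < 0:
        start += ndim
    if end < 0:
        end += ndim
    if not 0 <= start <= end < ndim:
        raise ValueError("invalid axis range for array with %d dimensions" % ndim)
    # Size of the merged axis is the product of the merged axis lengths.
    merged = math.prod(arr.shape[start:end + 1])
    new_shape = arr.shape[:start] + (merged,) + arr.shape[end + 1:]
    return arr.reshape(new_shape)

x = np.zeros((50, 100, 25))
print(flatten_dims(x, 0, 1).shape)  # (5000, 25)

y = np.zeros((3, 4, 5, 6, 7, 8))
print(flatten_dims(y, 2, 3).shape)  # (3, 4, 30, 7, 8)
```

With the default arguments the helper merges every axis, so it behaves like a full flatten to 1-D.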

Peter

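A slight generalization of Alexander's answer: np.reshape can take -1 as an argument, meaning "total array size divided by the product of all other listed dimensions".

For example, to flatten all but the last dimension: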

>>> import numpy
>>> arr = numpy.zeros((50,100,25))
>>> new_arr = arr.reshape(-1, arr.shape[-1])
>>> new_arr.shape
# (5000, 25)
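It can help to check where each original element ends up. With the default C (row-major) order, row `i * 100 + j` of the flattened array is `arr[i, j]`. The sketch below verifies this for one arbitrarily chosen pair of indices; the values of `i` and `j` are just an example.

```python
import numpy as np

arr = np.arange(50 * 100 * 25).reshape(50, 100, 25)
new_arr = arr.reshape(-1, arr.shape[-1])

i, j = 7, 42
# In C order the merged index is i * (length of axis 1) + j.
row = i * arr.shape[1] + j
print(np.array_equal(new_arr[row], arr[i, j]))  # True
```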

KeithWM

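A slight generalization of Peter's answer: if you want to go beyond three-dimensional arrays, you can specify a range over the original array's shape.

For example, to flatten all but the last two dimensions: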

import numpy
arr = numpy.zeros((3, 4, 5, 6))
new_arr = arr.reshape(-1, *arr.shape[-2:])
new_arr.shape
# (12, 5, 6)

EDIT: A slight generalization of my earlier answer: you can, of course, also specify a range at the beginning of the reshape:

arr = numpy.zeros((3, 4, 5, 6, 7, 8))
new_arr = arr.reshape(*arr.shape[:2], -1, *arr.shape[-2:])
new_arr.shape
# (3, 4, 30, 7, 8)

It's been more than two years already... we need another slight generalization! ;)

kmario23

An alternative approach is to use numpy.resize(), as in:

In [37]: shp = (50,100,25)
In [38]: arr = np.random.random_sample(shp)
In [45]: resized_arr = np.resize(arr, (np.prod(shp[:2]), shp[-1]))
In [46]: resized_arr.shape
Out[46]: (5000, 25)

# sanity check with other solutions
In [47]: reshaped_arr = np.reshape(arr, (-1, shp[-1]))
In [48]: np.allclose(resized_arr, reshaped_arr)
Out[48]: True
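A caveat with this approach: np.resize and reshape agree only when the requested shape has the same number of elements as the input. If the sizes differ, np.resize repeats the data to fill the new shape instead of raising an error, which can hide a mistake in the shape arithmetic. A minimal sketch of the difference, using a small array chosen for this example:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Sizes match: np.resize yields the same values as reshape.
same = np.resize(a, (3, 2))
print(np.array_equal(same, a.reshape(3, 2)))  # True

# Sizes differ: np.resize silently repeats the data to fill 8 slots.
bigger = np.resize(a, (2, 4))
print(bigger)
# [[0 1 2 3]
#  [4 5 0 1]]

# reshape refuses the same request.
try:
    a.reshape(2, 4)
except ValueError as exc:
    print("reshape refused:", exc)
```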

Sherman

numpy.vstack is perfect for this situation

import numpy as np
arr = np.ones((50,100,25))
np.vstack(arr).shape
> (5000, 25)

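I prefer to use stack, vstack or hstack over reshape, because reshape just scans through the data in order and seems to force it into the desired shape. This can be problematic if you are going to, for example, take column averages.

Here is an illustration of what I mean. Suppose we have the following array: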

>>> arr.shape
(2, 3, 4)
>>> arr 
array([[[1, 2, 3, 4],
        [1, 2, 3, 4],
        [1, 2, 3, 4]],

       [[7, 7, 7, 7],
        [7, 7, 7, 7],
        [7, 7, 7, 7]]])

We apply both methods to get an array of shape (3,8):

>>> arr.reshape((3,8)).shape
(3, 8)
>>> np.hstack(arr).shape 
(3, 8)

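However, if we look at how the data has been rearranged in each case, hstack lets us take column sums that we could also have calculated from the original array. A plain reshape to (3,8) does not allow this.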

>>> arr.reshape((3,8))
array([[1, 2, 3, 4, 1, 2, 3, 4],
       [1, 2, 3, 4, 7, 7, 7, 7],
       [7, 7, 7, 7, 7, 7, 7, 7]])
>>> np.hstack(arr)
array([[1, 2, 3, 4, 7, 7, 7, 7],
       [1, 2, 3, 4, 7, 7, 7, 7],
       [1, 2, 3, 4, 7, 7, 7, 7]])
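For readers who want to stay with reshape, the same layouts can be reproduced by reordering the axes first. The sketch below rebuilds the array from this answer and checks that a transpose followed by reshape matches np.hstack, and that a plain reshape matches np.vstack; the variable names are chosen for this example only.

```python
import numpy as np

# Same values as the (2, 3, 4) array shown above.
arr = np.array([[[1, 2, 3, 4]] * 3,
                [[7, 7, 7, 7]] * 3])

stacked = np.hstack(arr)

# Move the original axis 1 to the front, then merge the remaining two axes.
via_reshape = arr.transpose(1, 0, 2).reshape(arr.shape[1], -1)
print(np.array_equal(stacked, via_reshape))  # True

# np.vstack on a 3-D array matches merging the first two axes with reshape.
print(np.array_equal(np.vstack(arr), arr.reshape(-1, arr.shape[-1])))  # True

# Column sums of the hstack result equal sums over axis 1 of the original.
print(stacked.sum(axis=0))  # [ 3  6  9 12 21 21 21 21]
```

So the difference between the two methods is which axes end up adjacent before merging, not something reshape is unable to express.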
