Notes on Selected NumPy Functions

Study notes on a few NumPy APIs that I am less familiar with but that turn out to be very useful.

`numpy.lexsort()`

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.lexsort.html

Using NumPy v1.24 as the example.

numpy.lexsort(keys, axis=-1)

Perform an indirect stable sort using a sequence of keys.

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2D array is provided for the keys argument, its rows are interpreted as the sorting keys and sorting is according to the last row, second last row etc.

Parameters:

keys : (k, N) array or tuple containing k (N,)-shaped sequences. The k different “columns” to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

axis : int, optional. Axis to be indirectly sorted. By default, sort over the last axis.

Returns:

indices : (N,) ndarray of ints. Array of indices that sort the keys along the specified axis.

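As an analogy, think of ranking exam results: students are ranked by total score first, and ties are broken by the Chinese, math, and English scores in that order. That is a sort on multiple keys. The example below uses only the three subject scores as keys, with Chinese as the primary key: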

Name	Chinese	Math	English
小明	92	96	100
小红	82	90	85
小丽	82	60	85
小亮	82	90	70

import numpy as np

a = np.array([92, 82, 82, 82]) # first column
b = np.array([96, 90, 60, 90]) # second column
c = np.array([100, 85, 85, 70]) # third column

np.lexsort((c, b, a)) # default, last column is the primary key

# result is array([2, 3, 1, 0])
# i.e. 小丽 < 小亮 < 小红 < 小明

# Equivalent form:
# sort a 2D array, which is the original table transposed (rows become keys)
datasheet = np.array([c, b, a])
# array([[100,  85,  85,  70],
#       [ 96,  90,  60,  90],
#       [ 92,  82,  82,  82]])

np.lexsort(datasheet)

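A few details to watch out for:

The sort is ascending by default.
When passing keys as several 1-D arrays (which must all have the same length), priority runs from the last array to the first: last, second-to-last, ..., first. That is why the order is (c, b, a) and not (a, b, c). When passing a 2D array (for example a datasheet), priority runs from the last row, second-to-last row, ..., to the first row.

Also, 1-D arrays in code behave like row vectors. Mathematically, the stacked 2D array would be written \(\left[ \begin{array}{c} c \\b \\ a \end{array} \right]\), but in code np.array([c, b, a]) can only be written on a single line. Out of habit it is easy to misread this as four rows by three columns, when it is actually three rows by four columns. So NumPy, rather considerately, treats the rows of a 2D array as the keys.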
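The key-order rule can be checked on a table stored the usual way, one row per student. The following is a minimal sketch: the `scores` array simply restates the table above, and negating the keys to get a descending sort is an assumption that only holds for numeric keys.

```python
import numpy as np

# Score table: one row per student, columns = (Chinese, Math, English).
scores = np.array([
    [92, 96, 100],
    [82, 90, 85],
    [82, 60, 85],
    [82, 90, 70],
])

# lexsort treats the LAST key as primary, so reverse the column order:
# scores.T has rows (Chinese, Math, English); [::-1] gives (English, Math, Chinese).
order = np.lexsort(scores.T[::-1])
print(order)          # [2 3 1 0]
print(scores[order])  # rows sorted ascending by Chinese, then Math, then English

# Descending order: negate the (numeric) keys before sorting.
order_desc = np.lexsort(-scores.T[::-1])
print(order_desc)     # [0 1 3 2]
```

Indexing with `scores[order]` is what actually reorders the rows; `lexsort` itself only returns the index array.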

`numpy.insert()`

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.insert.html

Let's go straight to the examples.

Examples: the difference between specifying axis and leaving it out

>>> a = np.array([[1, 1], [2, 2], [3, 3]])
>>> a
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5)
array([1, 5, 1, ..., 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1)
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])

Difference between sequence and scalars:

>>> np.insert(a, [1], [[1],[2],[3]], axis=1)
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True
>>> b = a.flatten()
>>> b
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, slice(2, 4), [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, [2, 2], [7.13, False]) # type casting
array([1, 1, 7, ..., 2, 3, 3])
>>> x = np.arange(8).reshape(2, 4)
>>> idx = (1, 3)
>>> np.insert(x, idx, 999, axis=1)
array([[  0, 999,   1,   2, 999,   3],
       [  4, 999,   5,   6, 999,   7]])
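The abbreviated outputs above hide the difference between an index list and a slice. The sketch below prints the full results for the same 1-D array `b`; the variable names `r_list`, `r_slice`, and `r_cast` are introduced here only for illustration.

```python
import numpy as np

b = np.array([1, 1, 2, 2, 3, 3])

# A list of indices: both values are placed before original position 2.
r_list = np.insert(b, [2, 2], [5, 6])
print(r_list)   # [1 1 5 6 2 2 3 3]

# slice(2, 4) means indices 2 and 3: 5 goes before b[2], 6 before b[3].
r_slice = np.insert(b, slice(2, 4), [5, 6])
print(r_slice)  # [1 1 5 2 6 2 3 3]

# Inserted values are cast to the dtype of the target array (int here).
r_cast = np.insert(b, [2, 2], [7.13, False])
print(r_cast)   # [1 1 7 0 2 2 3 3]

# np.insert returns a new array; b itself is unchanged.
print(b)        # [1 1 2 2 3 3]
```

All indices refer to positions in the original array, before any insertion takes place.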

`numpy.unique()`

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.unique.html#numpy-unique

Returns the deduplicated contents of an array.

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None, *, equal_nan=True)

Find the unique elements of an array.

Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:

the indices of the input array that give the unique values (note: the position where each unique value first occurs)

the indices of the unique array that reconstruct the input array

the number of times each unique value comes up in the input array

On to the examples.

1D case: note that if the input is a multi-dimensional array and axis is not specified, it is flattened to 1-D first.

>>> np.unique([1, 1, 2, 2, 3, 3], return_index=True, return_inverse=True, return_counts=True)
(array([1, 2, 3]), array([0, 2, 4]), array([0, 0, 1, 1, 2, 2]), array([2, 2, 2]))
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])
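To see what "flattened" and "first occurrence" mean in practice, here is a small sketch with a 2D input whose values are deliberately out of order. The array and variable names are made up for illustration.

```python
import numpy as np

a = np.array([[3, 1], [1, 3]])

# Without axis, the input is flattened to [3, 1, 1, 3] before deduplication.
u, first_idx, counts = np.unique(a, return_index=True, return_counts=True)
print(u)          # [1 3]
print(first_idx)  # [1 0]  -> positions in the flattened array
print(counts)     # [2 2]

# Indexing the flattened input with first_idx gives back the unique values.
print(a.ravel()[first_idx])  # [1 3]
```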

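2D case:

The case below comes up often, for example in voxel-based sparsification of a 3D point cloud. The array a has shape (N, 3), with one row per point. After the raw point coordinates are partitioned into voxels, each row holds the index of the voxel that contains that point. Sparsification then records only the indices of voxels that actually contain points, and applies geometric operations only at those positions.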

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> np.unique(a, axis=0, return_index=True, return_inverse=True, return_counts=True)
(array([[1, 0, 0], [2, 3, 4]]), array([0, 2]), array([0, 0, 1]), array([2, 1]))
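The voxel use case can be sketched end to end as follows. This is only an illustration under stated assumptions: the `points` array, the `voxel_size` value, and the floor-division voxelization are all hypothetical and not taken from any particular point-cloud library.

```python
import numpy as np

# Hypothetical point cloud: N points with (x, y, z) coordinates.
points = np.array([
    [0.10, 0.20, 0.05],
    [0.15, 0.22, 0.08],
    [1.30, 0.90, 2.10],
    [0.90, 0.10, 0.10],
])
voxel_size = 0.5  # assumed edge length of a cubic voxel

# Integer voxel index of each point (floor of coordinate / voxel size).
voxel_idx = np.floor(points / voxel_size).astype(np.int64)

# Keep only the voxels that contain at least one point.
occupied, counts = np.unique(voxel_idx, axis=0, return_counts=True)
print(occupied)  # rows [0 0 0], [1 0 0], [2 1 4]
print(counts)    # [2 1 1]
```

With `axis=0` each row is treated as one item, so the result lists every occupied voxel once, sorted row-wise.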

Reconstructing the original array after numpy.unique()

Reconstruct the input array from the unique values and inverse:

>>> a = np.array([1, 2, 6, 4, 2, 3, 2])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u
array([1, 2, 3, 4, 6])
>>> indices
array([0, 1, 4, 3, 1, 2, 1])
>>> u[indices]
array([1, 2, 6, 4, 2, 3, 2])

The same works in 2D:

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> u, indices = np.unique(a, axis=0, return_inverse=True)
>>> u
array([[1, 0, 0],
       [2, 3, 4]])
>>> indices
array([0, 0, 1])
>>> u[indices]
array([[1, 0, 0],
       [1, 0, 0],
       [2, 3, 4]])
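Besides reconstruction, the inverse indices are handy for group-wise aggregation, such as averaging a per-point value within each voxel. The sketch below assumes a hypothetical `feature` array with one value per row; the use of `np.bincount` is one possible approach, not the only one.

```python
import numpy as np

voxel_idx = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
# Hypothetical per-point feature (e.g. intensity), one value per row of voxel_idx.
feature = np.array([0.2, 0.4, 0.9])

u, inverse, counts = np.unique(
    voxel_idx, axis=0, return_inverse=True, return_counts=True
)
# Flatten defensively: the shape of the inverse has varied across NumPy versions.
inverse = inverse.reshape(-1)

# Sum the features that fall into each unique row, then divide by the counts.
sums = np.bincount(inverse, weights=feature, minlength=len(u))
means = sums / counts
print(means)  # approximately [0.3 0.9]
```

Here `inverse[i]` tells which row of `u` the i-th point belongs to, which is exactly the group label `np.bincount` needs.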

https://oier99.cn/posts/5d51c93/

Author: Oier99

Published: January 6, 2023
