Notes on Selected NumPy Functions


Study notes on a few NumPy APIs that I am less familiar with but that turn out to be very handy.

numpy.lexsort()

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.lexsort.html

The description below is taken from version v1.2.4.

numpy.lexsort(keys, axis=-1)

Perform an indirect stable sort using a sequence of keys.

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2D array is provided for the keys argument, its rows are interpreted as the sorting keys and sorting is according to the last row, second last row etc.

  • Parameters:

keys: (k, N) array or tuple containing k (N,)-shaped sequences. The k different “columns” to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

axis: int, optional. Axis to be indirectly sorted. By default, sort over the last axis.

  • Returns:

indices: (N,) ndarray of ints. Array of indices that sort the keys along the specified axis.

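An analogy: when ranking exam results, students are ordered first by total score and, when totals tie, by their Chinese, math, and English scores in turn. That is sorting by multiple keys. The example below uses the Chinese score as the primary key, then math, then English:

Name   Chinese   Math   English
小明   92        96     100
小红   82        90     85
小丽   82        60     85
小亮   82        90     70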
import numpy as np

a = np.array([92, 82, 82, 82])    # first column: Chinese
b = np.array([96, 90, 60, 90])    # second column: math
c = np.array([100, 85, 85, 70])   # third column: English

np.lexsort((c, b, a))  # the last key (a) is the primary key

# result is array([2, 3, 1, 0])
# 小丽 < 小亮 < 小红 < 小明

# Equivalent form:
# sorting with a 2D array, which is the original table transposed
datasheet = np.array([c, b, a])
# array([[100,  85,  85,  70],
#        [ 96,  90,  60,  90],
#        [ 92,  82,  82,  82]])

np.lexsort(datasheet)
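
Since lexsort returns indices rather than sorted values, the result is normally used to reorder other arrays. A minimal sketch of that step, reusing the scores from the table above (the variable names `names`, `order`, `ranked_names`, and `ranked_scores` are my own, not from the official docs):

```python
import numpy as np

names = np.array(["小明", "小红", "小丽", "小亮"])
a = np.array([92, 82, 82, 82])    # Chinese (primary key)
b = np.array([96, 90, 60, 90])    # math (secondary key)
c = np.array([100, 85, 85, 70])   # English (third key)

# lexsort returns the row order, not the sorted scores themselves
order = np.lexsort((c, b, a))

# Fancy indexing with the same index array reorders any aligned array
ranked_names = names[order]
ranked_scores = np.column_stack((a, b, c))[order]

print(ranked_names)
print(ranked_scores)
```

Indexing every aligned array with the same `order` keeps names and scores in step with each other.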

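A few details are worth noting:

  • The sort is in ascending order (smallest first).
  • When keys is given as several 1D arrays (which must all have the same length), the priority runs from the last array to the first: the last array is the primary key, the second-to-last is the secondary key, and so on. That is why the call uses (c, b, a) rather than (a, b, c). When keys is a 2D array (for example a whole datasheet), the priority runs from the last row, to the second-to-last row, and so on up to the first row.

Also, 1D arrays in code behave like row vectors, so the 2D array built from them is, in mathematical notation, \(\left[ \begin{array}{c} c \\ b \\ a \end{array} \right]\). In code, np.array([c, b, a]) has to be written on one line, so out of habit it is easy to read it as four rows by three columns when it is actually three rows by four columns. Because a 2D array is a sequence of rows, lexsort treats each row as one key, which matches passing the tuple (c, b, a).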
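
When the data is stored the other way round, with one row per student and one column per subject, the keys have to be rearranged before calling lexsort. The following is a small sketch under that assumption; `table`, `keys`, `ascending`, and `descending` are illustrative names. Negating the keys to get a descending order is a common trick rather than a lexsort option, and it assumes signed numeric data.

```python
import numpy as np

# One row per student, columns are (Chinese, math, English)
table = np.array([[92, 96, 100],
                  [82, 90, 85],
                  [82, 60, 85],
                  [82, 90, 70]])

# Reverse the column order so the first column ends up last (primary key),
# then transpose so that each column becomes one row, i.e. one key.
keys = table[:, ::-1].T

ascending = np.lexsort(keys)

# lexsort only sorts ascending; negating signed numeric keys flips the order
descending = np.lexsort(-keys)

print(ascending)
print(descending)
```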

numpy.insert()

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.insert.html

Let's go straight to the examples.

Examples: the difference between specifying axis and leaving it out

>>> a = np.array([[1, 1], [2, 2], [3, 3]])
>>> a
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5)
array([1, 5, 1, ..., 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1)
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])
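
The abbreviated output above hides what happens without axis: the array is flattened first and the result is 1D. A short sketch that prints the full results as lists; the `axis=0` case and the names `flat`, `col`, and `row` are my additions for comparison:

```python
import numpy as np

a = np.array([[1, 1], [2, 2], [3, 3]])

# axis omitted: a is flattened, then 5 is inserted before index 1
flat = np.insert(a, 1, 5)

# axis=1: a new column of 5s is inserted before column 1
col = np.insert(a, 1, 5, axis=1)

# axis=0: the scalar 5 is broadcast to a whole new row before row 1
row = np.insert(a, 1, 5, axis=0)

print(flat.tolist())   # 1D result
print(col.tolist())
print(row.tolist())
print(a.shape)         # np.insert returns a new array; a itself is unchanged
```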

Difference between sequence and scalars:

>>> np.insert(a, [1], [[1],[2],[3]], axis=1)
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True
>>> b = a.flatten()
>>> b
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, slice(2, 4), [5, 6])
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, [2, 2], [7.13, False]) # type casting
array([1, 1, 7, ..., 2, 3, 3])
>>> x = np.arange(8).reshape(2, 4)
>>> idx = (1, 3)
>>> np.insert(x, idx, 999, axis=1)
array([[  0, 999,   1,   2, 999,   3],
       [  4, 999,   5,   6, 999,   7]])
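
Three of the 1D calls above print the same abbreviated output even though their full results differ. The sketch below prints them in full to show that the indices always refer to positions in the original array; the names `same_index`, `sliced`, and `cast` are mine:

```python
import numpy as np

b = np.array([1, 1, 2, 2, 3, 3])

# Both values are inserted before original index 2, in the order given
same_index = np.insert(b, [2, 2], [5, 6])

# slice(2, 4) means indices 2 and 3: 5 goes before b[2], 6 before b[3]
sliced = np.insert(b, slice(2, 4), [5, 6])

# Values are cast to b's integer dtype: 7.13 becomes 7, False becomes 0
cast = np.insert(b, [2, 2], [7.13, False])

print(same_index.tolist())
print(sliced.tolist())
print(cast.tolist(), cast.dtype)
```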

numpy.unique()

Official docs: https://numpy.org/doc/stable/reference/generated/numpy.unique.html#numpy-unique

Returns the elements of an array with duplicates removed.

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None, *, equal_nan=True)

Find the unique elements of an array.

Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:

  • the indices of the input array that give the unique values (note: the position where each unique value first occurs)
  • the indices of the unique array that reconstruct the input array
  • the number of times each unique value comes up in the input array

Examples:

1D case: note that a multi-dimensional input is flattened to 1D when axis is not specified.

>>> np.unique([1, 1, 2, 2, 3, 3], return_index=True, return_inverse=True, return_counts=True)
(array([1, 2, 3]), array([0, 2, 4]), array([0, 0, 1, 1, 2, 2]), array([2, 2, 2]))
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])
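
One practical use of the optional outputs is finding the most frequent value. A small sketch, using a made-up 2D array `labels` to show that the returned first-occurrence indices refer to the flattened input:

```python
import numpy as np

# Hypothetical 2D input; without axis it is treated as [3, 1, 3, 2, 3, 1]
labels = np.array([[3, 1, 3],
                   [2, 3, 1]])

values, first_index, counts = np.unique(
    labels, return_index=True, return_counts=True
)

# The unique value that occurs most often
most_common = values[np.argmax(counts)]

print(values, first_index, counts)
print(most_common)
```

Here `first_index` indexes `labels.ravel()`, not the 2D array, so `labels.ravel()[first_index]` gives back `values`.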

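2D case:

This case comes up often, for example in voxel sparsification of 3D point clouds. Here a is an (N, 3) array holding the 3D coordinates of N points. Once the raw point-cloud coordinates have been partitioned into voxels, each point has the index of the voxel it falls in. Sparsification means recording only the indices of voxels that actually contain points and applying geometric transforms only at those positions.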

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> np.unique(a, axis=0, return_index=True, return_inverse=True, return_counts=True)
(array([[1, 0, 0], [2, 3, 4]]), array([0, 2]), array([0, 0, 1]), array([2, 1]))
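
The voxel use case described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the sample `points`, the `voxel_size` of 0.5, and the choice to compute one centroid per occupied voxel are all hypothetical, not taken from any particular point-cloud library.

```python
import numpy as np

# Hypothetical point cloud: 5 points with (x, y, z) coordinates
points = np.array([[0.1, 0.2, 0.3],
                   [0.4, 0.1, 0.2],
                   [1.2, 0.3, 0.1],
                   [0.3, 0.4, 0.4],
                   [1.4, 0.2, 0.3]])
voxel_size = 0.5  # assumed edge length of a voxel

# Integer voxel index of every point
voxel_idx = np.floor(points / voxel_size).astype(np.int64)

# Keep only voxels that contain at least one point
occupied, inverse, counts = np.unique(
    voxel_idx, axis=0, return_inverse=True, return_counts=True
)
# Flatten defensively: the shape of inverse with axis set differs
# between some NumPy versions
inverse = inverse.reshape(-1)

# Sum the points of each occupied voxel, then divide to get centroids
sums = np.zeros((len(occupied), 3))
np.add.at(sums, inverse, points)
centroids = sums / counts[:, None]

print(occupied)
print(centroids)
```

`inverse` maps every point to its row in `occupied`, which is what makes the per-voxel accumulation with `np.add.at` possible.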

Reconstructing the original array after numpy.unique()

Reconstruct the input array from the unique values and inverse:

>>> a = np.array([1, 2, 6, 4, 2, 3, 2])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u
array([1, 2, 3, 4, 6])
>>> indices
array([0, 1, 4, 3, 1, 2, 1])
>>> u[indices]
array([1, 2, 6, 4, 2, 3, 2])

The same works in 2D:

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> u, indices = np.unique(a, axis=0, return_inverse=True)
>>> u
array([[1, 0, 0],
       [2, 3, 4]])
>>> indices
array([0, 0, 1])
>>> u[indices]
array([[1, 0, 0],
       [1, 0, 0],
       [2, 3, 4]])

Notes on Selected NumPy Functions
https://oier99.cn/posts/5d51c93/
Author: Oier99
Published: January 6, 2023
License