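A Summary of NumPy Library Functions

Hello, this is Gliver.

This post is a summary of functions in the NumPy library.

[Good to know before reading]

In this post, np refers to numpy, and

ndarray refers to the ndarray object of the numpy library.

Also, in a signature such as np.arange(start, stop, step),

start and step are optional arguments and stop is a required argument.

Short quoted descriptions such as "Create an array."

are taken from the docstrings of the numpy library.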

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)

ㆍ np.ndarray.ndim : Number of array dimensions.
ㆍ np.ndarray.dtype : Data-type of the array's elements.
ㆍ np.ndarray.size : Number of elements in the array.
ㆍ np.ndarray.itemsize : Length of one array element in bytes.
ㆍ np.ndarray.nbytes : Total bytes consumed by the elements of the array.
ㆍ np.ndarray.strides : Tuple of bytes to step in each dimension when traversing an array.
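
As a quick check of the attributes above, here is a minimal sketch. The dtype is fixed to int64 as an assumption of this example, so that the byte counts are the same on every platform.

```python
import numpy as np

# dtype=np.int64 is an assumption of this example; the default integer
# type of np.arange can differ between platforms.
a2 = np.arange(1, 10, dtype=np.int64).reshape(3, 3)

print(a2.ndim)      # 2
print(a2.dtype)     # int64
print(a2.size)      # 9
print(a2.itemsize)  # 8  (bytes per element)
print(a2.nbytes)    # 72 (size * itemsize)
print(a2.strides)   # (24, 8): 24 bytes to the next row, 8 to the next column
```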

[Indexing]

ㆍ ndarray[index]
ㆍ ndarray[2]

ㆍ ndarray[2][3] # ndarray[2, 3]

ㆍ ndarray[2][-1] # ndarray[2, -1]

[Slicing]

ㆍ ndarray[start : stop : step]

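ㆍ Defaults : start = 0, stop = end of the axis, step = 1 (for a positive step)

ㆍ ndarray[:2] # ndarray[0:2]

ㆍ ndarray[:] # ndarray[0:]

ㆍ ndarray[::2] # ndarray[0::2]

ㆍ ndarray[:][1:] # ndarray[0:][1:], which is the same as ndarray[1:]. To slice columns, use ndarray[:, 1:]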
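
The indexing and slicing forms above can be tried out as follows. This is only a small sketch using the a1 and a2 arrays defined earlier in the post.

```python
import numpy as np

a1 = np.arange(1, 10)                # [1 2 3 4 5 6 7 8 9]
a2 = np.arange(1, 10).reshape(3, 3)  # rows: [1 2 3], [4 5 6], [7 8 9]

print(a1[2])       # 3
print(a2[2][-1])   # 9
print(a2[2, -1])   # 9, same element with a single index tuple
print(a1[:2])      # [1 2]
print(a1[::2])     # [1 3 5 7 9]
print(a2[:][1:])   # rows 1 and 2, identical to a2[1:]
print(a2[:, 1:])   # columns 1 and 2 of every row
```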

[Boolean Indexing]

ㆍ ndarray[boolean list] : Returns the elements of ndarray at the positions where boolean list is True. The result is a copy, not a view.

[Fancy Indexing]

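ㆍ Passing a sequence of indices in order to access several array elements at once. The result is a copy, not a view.

ㆍ ndarray[list_1, list_2, ..., list_n] : For an n-dimensional ndarray, up to n index lists can be passed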
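
Here is a small sketch of Boolean Indexing and Fancy Indexing. The names mask, evens, picked, and pairs are just illustrative names for this example.

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)

mask = a1 % 2 == 0          # boolean array, True where the element is even
evens = a1[mask]            # [2 4 6 8]
picked = a1[[0, 3, 8]]      # [1 4 9]
pairs = a2[[0, 2], [1, 2]]  # elements at (0, 1) and (2, 2): [2 9]

evens[0] = -1               # changes the copy only
print(a1[1])                # 2, the original array is unchanged
```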

4. Inserting / Modifying / Deleting / Copying Array Elements

ndim ≥ 2
ㆍ axis = 0 # axis = rows
ㆍ axis = 1 # axis = columns


[Inserting array elements]

ㆍ np.insert(arr, obj, values, axis) : Inserts values at the given positions of the array and returns a new ndarray

ㆍ Insert values along the given axis before the given indices.

ㆍ obj : Object that defines the index or indices before which `values` is inserted. (int | sequence of ints)

ㆍ axis = None : If axis is not given, the array is flattened to 1-D first.

ㆍ np.insert(a1, 0, 101)

ㆍ np.insert(a2, 0, 101, axis=0) # axis is rows

ㆍ np.insert(a2, [0, 1, 2], 101, axis=1) # axis is columns
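
The three calls above give the following results. A minimal sketch, using the a1 and a2 arrays from the top of the post:

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)

r1 = np.insert(a1, 0, 101)                    # [101 1 2 ... 9]
r2 = np.insert(a2, 0, 101, axis=0)            # new first row [101 101 101]
r3 = np.insert(a2, [0, 1, 2], 101, axis=1)    # 101 before every original column

print(r1.shape, r2.shape, r3.shape)           # (10,) (4, 3) (3, 6)
print(r3[0])                                  # [101   1 101   2 101   3]
print(a2.shape)                               # (3, 3), the input is not modified
```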

[Modifying array elements]

ㆍ Assign through any of the lookup forms (Indexing, Slicing, Boolean Indexing, Fancy Indexing) to modify values.

ㆍ ndarray[2] = 101

ㆍ ndarray[:][1:] = 101 # same as ndarray[1:] = 101
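
Assignment works with all four lookup forms. The following is only a sketch; the values being assigned are arbitrary.

```python
import numpy as np

a2 = np.arange(1, 10).reshape(3, 3)

a2[0, 0] = 101              # Indexing: a single element
a2[1:] = 0                  # Slicing: rows 1 and 2
a2[a2 > 100] = -1           # Boolean Indexing: every element greater than 100
a2[[0, 0], [1, 2]] = 7      # Fancy Indexing: elements (0, 1) and (0, 2)

print(a2.tolist())          # [[-1, 7, 7], [0, 0, 0], [0, 0, 0]]
```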

[Deleting array elements]

ㆍ np.delete(arr, obj, axis) : Removes the values at the given positions of the array and returns a new ndarray

ㆍ Return a new array with sub-arrays along an axis deleted.

ㆍ obj : Indicate indices of sub-arrays to remove along the specified axis. (int | sequence of ints)

ㆍ axis = None : If axis is not given, the array is flattened to 1-D first.

ㆍ np.delete(a1, 0)

ㆍ np.delete(a2, 0, axis=0) # axis is rows

ㆍ np.delete(a2, [0, 1, 2], axis=1) # axis is columns
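
A short sketch of np.delete follows. The call with [0, 2] is an extra case added for this example, to show a result that still has a column left.

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)

d1 = np.delete(a1, 0)                    # [2 3 4 5 6 7 8 9]
d2 = np.delete(a2, 0, axis=0)            # first row removed
d3 = np.delete(a2, [0, 1, 2], axis=1)    # every column removed
d4 = np.delete(a2, [0, 2], axis=1)       # only the middle column remains

print(d2.tolist())   # [[4, 5, 6], [7, 8, 9]]
print(d3.shape)      # (3, 0)
print(d4.tolist())   # [[2], [5], [8]]
```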

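[Copying arrays]

Slicing, and basic Indexing that selects a sub-array, return a view (shallow copy).
If you modify a value through a variable that holds a view, the value in the original array changes as well.
Boolean Indexing and Fancy Indexing, on the other hand, return a copy.

ㆍ ndarray.copy() : Explicitly copies the values of an array or sub-array and returns them as a new ndarray. (deep copy)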

ㆍ Return a copy of the array.
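
The difference between a view and a copy can be checked like this. A minimal sketch; np.shares_memory is used here only to confirm whether two arrays use the same buffer.

```python
import numpy as np

a1 = np.arange(1, 10)

view = a1[:3]        # slicing returns a view
view[0] = 100        # this writes into a1 as well
print(a1[0])         # 100

deep = a1.copy()     # explicit deep copy
deep[1] = 200        # a1 is not affected
print(a1[1])         # 2

print(np.shares_memory(a1, view))   # True
print(np.shares_memory(a1, deep))   # False
```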

5. Array Transformations

[Transposing arrays and changing axes]

ㆍ ndarray.T : Returns the transpose of the array as a view. (shallow copy)

ㆍ The transposed array.

ㆍ np.transpose(arr, axes) : Returns the array with its axes reversed or permuted according to 'axes'. A view is returned whenever possible. (shallow copy)

ㆍ Reverse or permute the axes of an array; returns the modified array.

ㆍ ndarray.swapaxes(axis1, axis2) : Returns a view with axis1 and axis2 interchanged. (shallow copy)

ㆍ Return a view of the array with `axis1` and `axis2` interchanged.
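
Here is a small sketch of the three ways of changing axes. The 3-D array a3 is a hypothetical array added for this example.

```python
import numpy as np

a2 = np.arange(1, 10).reshape(3, 3)
a3 = np.arange(24).reshape(2, 3, 4)          # hypothetical 3-D array

print(a2.T[0])                               # [1 4 7], the first column of a2
print(np.transpose(a3, (1, 0, 2)).shape)     # (3, 2, 4)
print(a3.swapaxes(0, 2).shape)               # (4, 3, 2)

# All three results share memory with the original array.
print(np.shares_memory(a2, a2.T))                 # True
print(np.shares_memory(a2, np.transpose(a2)))     # True
print(np.shares_memory(a3, a3.swapaxes(0, 2)))    # True
```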

[Reshaping arrays]

ㆍ ndarray.reshape(shape) : Returns the ndarray reshaped to 'shape'. The result is a view whenever possible (shallow copy); otherwise a copy is made.

ㆍ Returns an array containing the same data with a new shape.

ㆍ ndarray.resize(new_shape, refcheck) : Changes the ndarray itself to 'new_shape' in place. (no return value)

ㆍ Change shape and size of array in-place.

ㆍ refcheck : If False, reference count will not be checked. (bool)

ㆍ refcheck = True

ㆍ ndarray.resize((3, 3))

ㆍ ndarray.resize((5, 5), refcheck=False)
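
The following sketch contrasts reshape and resize. refcheck=False is passed to both resize calls as a precaution of this example, since the reference count check can fail in interactive environments.

```python
import numpy as np

a = np.arange(1, 10)
b = a.reshape(3, 3)                # contiguous data, so this is a view
print(np.shares_memory(a, b))      # True

c = b.T.reshape(9)                 # transposed data is not contiguous, so a copy is made
print(np.shares_memory(b, c))      # False

r = np.arange(1, 10)               # owns its data, so it can be resized in place
r.resize((3, 3), refcheck=False)
r.resize((5, 5), refcheck=False)   # the added entries are filled with zeros
print(r.shape, r.sum())            # (5, 5) 45
```

Note that resize only works on an array that owns its data. Calling it on a view such as b above raises a ValueError.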

[Appending & joining arrays]

ㆍ np.append(arr, values, axis) : Appends 'values' to the end of 'arr' and returns a new ndarray

ㆍ Append values to the end of an array.

ㆍ axis = None : If axis is not given, both arr and values are flattened to 1-D first.

ㆍ np.append(a1, 3)

ㆍ np.append(a1, b1)

ㆍ np.append(a2, b2)

ㆍ np.append(a2, b2, axis=0)

ㆍ np.append(a2, b2, axis=1)

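ㆍ Note: when axis is given, the arrays must have the same number of dimensions (ndim), and their shapes must match on every axis other than 'axis'.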
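
The np.append calls above can be sketched as follows. b1 and b2 are not defined in this post, so the definitions here are assumptions made for the example.

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)
b1 = np.arange(10, 13)                   # hypothetical 1-D array
b2 = np.arange(10, 19).reshape(3, 3)     # hypothetical 3x3 array

print(np.append(a1, 3).shape)            # (10,)
print(np.append(a1, b1).shape)           # (12,)
print(np.append(a2, b2).shape)           # (18,), both flattened because axis is None
print(np.append(a2, b2, axis=0).shape)   # (6, 3)
print(np.append(a2, b2, axis=1).shape)   # (3, 6)
```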

ㆍ np.concatenate(arrs, axis) : Joins the arrays in 'arrs' along 'axis' and returns a new ndarray

ㆍ Join a sequence of arrays along an existing axis.

ㆍ arrs : sequence of array_like.

ㆍ axis = 0

ㆍ np.concatenate([a1, b1, c1])

ㆍ np.concatenate([a2, b2, c2])

ㆍ np.concatenate([a2, b2, c2], axis=1)

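ㆍ Note: the arrays must have the same number of dimensions (ndim), and their shapes must match on every axis other than 'axis'.

ㆍ np.vstack(arrs) : Joins the arrays in 'arrs' row-wise (vertically) and returns a new ndarray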

ㆍ Stack arrays in sequence vertically (row wise).

ㆍ np.hstack(arrs) : Joins the arrays in 'arrs' column-wise (horizontally) and returns a new ndarray

ㆍ Stack arrays in sequence horizontally (column wise).

ㆍ np.dstack(arrs) : Stack arrays in sequence depth wise (along third axis).
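
A sketch of the joining functions, looking only at the resulting shapes. The arrays b1, c1, b2, and c2 are assumptions of this example and are simply filled with constant values.

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)
b1, c1 = np.full(9, 10), np.full(9, 20)             # hypothetical 1-D arrays
b2, c2 = np.full((3, 3), 10), np.full((3, 3), 20)   # hypothetical 3x3 arrays

print(np.concatenate([a1, b1, c1]).shape)           # (27,)
print(np.concatenate([a2, b2, c2]).shape)           # (9, 3)
print(np.concatenate([a2, b2, c2], axis=1).shape)   # (3, 9)

print(np.vstack([a2, b2]).shape)   # (6, 3)
print(np.hstack([a2, b2]).shape)   # (3, 6)
print(np.dstack([a2, b2]).shape)   # (3, 3, 2)
```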

[Splitting arrays]

ㆍ np.split(arr, indices_or_sections, axis) : Splits 'arr' into sub-arrays and returns them as a list of views (shallow copy)

ㆍ Split an array into multiple sub-arrays as views into `ary`.

ㆍ indices_or_sections : int or 1-D array

ㆍ If `indices_or_sections` is an integer, N, the array will be divided into N equal arrays along `axis`.

ㆍ If `indices_or_sections` is a 1-D array of sorted integers, the entries indicate where along `axis` the array is split.

ㆍ axis = 0

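ㆍ np.split(a1, 3) # returns 3 sub-arrays (a1 has 9 elements, so an integer N must divide 9 evenly; np.split(a1, 5) raises a ValueError)

ㆍ np.split(a1, [3]) # returns 2 sub-arrays

ㆍ np.split(a1, [2, 4, 5, 8]) # returns 5 sub-arrays

ㆍ np.split(a2, [1, 2]) # returns 3 sub-arrays

ㆍ np.split(a2, [1, 2], axis=1) # returns 3 sub-arrays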

ㆍ np.vsplit(arr, indices_or_sections) : Split an array into multiple sub-arrays vertically (row-wise).

ㆍ np.hsplit(arr, indices_or_sections) : Split an array into multiple sub-arrays horizontally (column-wise).

ㆍ np.dsplit(arr, indices_or_sections) : Split array into multiple sub-arrays along the 3rd axis (depth).
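
The splitting functions can be sketched like this. The 3-D array a3 is a hypothetical array added for the np.dsplit call, which needs at least three dimensions.

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = np.arange(1, 10).reshape(3, 3)
a3 = np.arange(8).reshape(2, 2, 2)          # hypothetical 3-D array

thirds = np.split(a1, 3)                    # [1 2 3], [4 5 6], [7 8 9]
parts = np.split(a1, [2, 4, 5, 8])          # [1 2], [3 4], [5], [6 7 8], [9]
cols = np.split(a2, [1, 2], axis=1)         # three columns of shape (3, 1)

print(len(thirds), len(parts), len(cols))   # 3 5 3
print(np.shares_memory(a1, parts[0]))       # True, the pieces are views

print(np.vsplit(a2, 3)[0].shape)            # (1, 3)
print(np.hsplit(a2, 3)[0].shape)            # (3, 1)
print(np.dsplit(a3, 2)[0].shape)            # (2, 2, 1)
```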

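6. Array Operations

ㆍ NumPy array operations are vectorized: an operation is applied element-wise to whole arrays, and arrays of different shapes are combined through broadcasting.
ㆍ This makes repeated computation over array elements efficient, because it is done with array-level operations instead of explicit Python loops.

[Arithmetic operations]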

ㆍ np.add() # a + b

ㆍ np.subtract() # a - b

ㆍ np.multiply() # a * b

ㆍ np.divide() # a / b

ㆍ np.floor_divide() # a // b

ㆍ np.mod() # a % b

ㆍ np.power() # a ** b

ㆍ np.negative() # -a
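
Each function above matches the operator in its comment. Here is a minimal sketch, with a broadcasting example at the end; the arrays a, b, and row are arbitrary values chosen for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.add(a, b))           # [5 7 9], same as a + b
print(np.subtract(a, b))      # [-3 -3 -3]
print(np.multiply(a, b))      # [ 4 10 18]
print(np.divide(b, a))        # [4.  2.5 2. ]
print(np.floor_divide(b, a))  # [4 2 2]
print(np.mod(b, a))           # [0 1 0]
print(np.power(a, 2))         # [1 4 9]
print(np.negative(a))         # [-1 -2 -3]

# Broadcasting: the 1-D row is added to every row of the 3x3 array.
a2 = np.arange(1, 10).reshape(3, 3)
row = np.array([10, 20, 30])
print((a2 + row).tolist())    # [[11, 22, 33], [14, 25, 36], [17, 28, 39]]
```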

[Absolute value function]

ㆍ np.absolute() # np.abs()

[Square / square root functions]

ㆍ np.square() # square

ㆍ np.sqrt() # square root

[Exponential and logarithmic functions]

ㆍ np.exp() # exponential with base e

ㆍ np.exp2() # exponential with base 2

ㆍ np.log() # logarithm with base e

ㆍ np.log2() # logarithm with base 2

ㆍ np.log10() # logarithm with base 10

[Trigonometric functions]

ㆍ np.sin() # sine

ㆍ np.cos() # cosine

ㆍ np.tan() # tangent

ㆍ np.arcsin() # inverse sine

ㆍ np.arccos() # inverse cosine

ㆍ np.arctan() # inverse tangent
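
All of these functions are applied element-wise. A small sketch with arbitrarily chosen inputs follows; the results are floating-point values, so the comments show the mathematically expected values.

```python
import numpy as np

x = np.array([-1.0, 4.0, -9.0])

abs_x = np.absolute(x)                      # 1, 4, 9
sq = np.square(np.array([1.0, 2.0, 3.0]))   # 1, 4, 9
rt = np.sqrt(abs_x)                         # 1, 2, 3

e2 = np.exp2(np.array([0.0, 1.0, 3.0]))     # 1, 2, 8
ln = np.log(np.exp(np.array([1.0, 2.0])))   # 1, 2 (log undoes exp)
l10 = np.log10(np.array([1.0, 10.0, 100.0]))  # 0, 1, 2

s = np.sin(np.pi / 2)                       # 1
ac = np.arccos(1.0)                         # 0
```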


[Aggregate functions]

ㆍ np.sum(arr, axis) : sum

ㆍ Sum of array elements over a given axis.

ㆍ np.cumsum(arr, axis) : cumulative sum

ㆍ Return the cumulative sum of the elements along a given axis.

ㆍ np.prod(arr, axis) : product

ㆍ Return the product of array elements over a given axis.

ㆍ np.cumprod(arr, axis) : cumulative product

ㆍ Return the cumulative product of elements along a given axis.

ㆍ np.diff(arr, axis) : difference between adjacent elements

ㆍ Calculate the n-th discrete difference along the given axis.

ㆍ np.dot(arr1, arr2) : dot product

ㆍ Dot product of two arrays.

ㆍ np.cross(arr1, arr2) : cross product (vector product)

ㆍ Return the cross product of two (arrays of) vectors.

ㆍ np.inner(arr1, arr2) : inner product

ㆍ Inner product of two arrays.

ㆍ np.outer(arr1, arr2) : outer product

ㆍ Compute the outer product of two vectors.

ㆍ np.mean(arr, axis) : mean

ㆍ Compute the arithmetic mean along the specified axis.

ㆍ np.std(arr, axis) : standard deviation

ㆍ Compute the standard deviation along the specified axis.

ㆍ np.var(arr, axis) : variance

ㆍ Compute the variance along the specified axis.

ㆍ np.min(arr, axis) : minimum

ㆍ Return the minimum of an array or minimum along an axis.

ㆍ np.max(arr, axis) : maximum

ㆍ Return the maximum of an array or maximum along an axis.

ㆍ np.argmin(arr, axis) : index of the minimum

ㆍ Returns the indices of the minimum values along an axis.

ㆍ np.argmax(arr, axis) : index of the maximum

ㆍ Returns the indices of the maximum values along an axis.

ㆍ np.median(arr, axis) : median

ㆍ Compute the median along the specified axis.

ㆍ np.any(arr, axis) : OR over the elements

ㆍ Test whether any array element along a given axis evaluates to True.

ㆍ np.all(arr, axis) : AND over the elements

ㆍ Test whether all array elements along a given axis evaluate to True.
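
Here is a sketch of how the axis argument changes the result of an aggregate function, along with a few of the product functions. The small vectors u and v are arbitrary values chosen for this example.

```python
import numpy as np

a2 = np.arange(1, 10).reshape(3, 3)   # rows: [1 2 3], [4 5 6], [7 8 9]

print(np.sum(a2))                     # 45, over all elements
print(np.sum(a2, axis=0))             # [12 15 18], one result per column
print(np.sum(a2, axis=1))             # [ 6 15 24], one result per row
print(np.cumsum(a2, axis=1)[0])       # [1 3 6]
print(np.cumprod([1, 2, 3, 4]))       # [ 1  2  6 24]
print(np.diff([1, 4, 9, 16]))         # [3 5 7]

print(np.mean(a2), np.median(a2))     # 5.0 5.0
print(np.argmax(a2))                  # 8, index into the flattened array
print(np.argmax(a2, axis=1))          # [2 2 2]
print(np.any(a2 > 8), np.all(a2 > 1)) # True False

u, v = np.array([1, 2, 3]), np.array([4, 5, 6])
print(np.dot(u, v), np.inner(u, v))   # 32 32, identical for 1-D arrays
print(np.cross([1, 0, 0], [0, 1, 0])) # [0 0 1]
print(np.outer([1, 2], [3, 4]).tolist())  # [[3, 4], [6, 8]]
```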

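[ndarray and operators]

ㆍ Using comparison operators with an ndarray

ㆍ The comparison is applied to each element of the ndarray, and an array with dtype bool and the same shape as the ndarray is returned

ㆍ The result can be used for Boolean Indexing

ㆍ Using boolean operators with an ndarray

ㆍ The boolean operation is applied to each element, and an array with dtype bool and the same shape as the ndarray is returned. Use the operators &, |, ^, ~ here; the Python keywords and, or, not do not work element-wise.

ㆍ The result can be used for Boolean Indexing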
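
Comparison and boolean operators can be combined into a mask as in the sketch below. Each comparison needs its own parentheses, because & and | bind more tightly than the comparison operators.

```python
import numpy as np

a1 = np.arange(1, 10)

gt3 = a1 > 3                        # bool array with the same shape as a1
both = (a1 > 3) & (a1 % 2 == 0)     # element-wise AND
either = (a1 < 3) | (a1 > 7)        # element-wise OR
inverted = ~(a1 > 3)                # element-wise NOT

print(gt3.dtype, gt3.shape)         # bool (9,)
print(a1[both])                     # [4 6 8]
print(a1[either])                   # [1 2 8 9]
print(a1[inverted])                 # [1 2 3]
```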

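ㆍ np.nan : A float object meaning Not a Number
ㆍ np.inf : A float object meaning ∞
ㆍ np.NINF : A float object meaning -∞ (this alias was removed in NumPy 2.0; use -np.inf instead)

ㆍ np.isclose(arr1, arr2) : Returns whether each pair of elements of the two arrays is equal within a tolerance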

ㆍ Returns a boolean array where two arrays are element-wise equal within a tolerance.

ㆍ Note: the shapes of arr1 and arr2 must be the same or broadcastable to each other.

ㆍ np.isnan(arr) : Returns whether each element of the array is NaN (Not a Number)
ㆍ Test element-wise for NaN and return result as a boolean array.

ㆍ np.isinf(arr) : Returns whether each element of the array is ∞ or -∞

ㆍ Test element-wise for positive or negative infinity.

ㆍ np.isfinite(arr) : Returns whether each element of the array is finite (finite = "anything other than ±∞ and NaN")

ㆍ Test element-wise for finiteness (not infinity and not Not a Number).
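
The special values and the test functions above can be sketched like this. -np.inf is used instead of np.NINF so that the example does not depend on that alias being available.

```python
import numpy as np

x = np.array([1.0, np.nan, np.inf, -np.inf])

print(np.isnan(x))      # [False  True False False]
print(np.isinf(x))      # [False False  True  True]
print(np.isfinite(x))   # [ True False False False]

print(np.nan == np.nan)             # False, NaN is never equal to itself
print(0.1 + 0.2 == 0.3)             # False because of floating-point rounding
print(np.isclose(0.1 + 0.2, 0.3))   # True, equal within the default tolerance

# The scalar 1.0 is broadcast against the array.
print(np.isclose(np.array([1.0, 2.0]), 1.0))   # [ True False]
```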

[Sorting arrays]

ㆍ np.sort(arr, axis) : Sorts 'arr' along 'axis' and returns the result as a new ndarray

ㆍ Return a sorted copy of an array.

ㆍ np.argsort(arr, axis) : Returns, as an ndarray, the indices that would sort 'arr' along 'axis'

ㆍ Returns the indices that would sort an array.

ㆍ ndarray.sort(axis) : Sorts the ndarray itself along 'axis' in place. (no return value)

ㆍ Sort an array in-place. Refer to `numpy.sort` for full documentation.
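
The three sorting forms differ in what they return and in whether the original array changes. A minimal sketch, with arbitrary example values:

```python
import numpy as np

a = np.array([3, 1, 2])

print(np.sort(a))       # [1 2 3], a new sorted array
print(a)                # [3 1 2], the original is unchanged
print(np.argsort(a))    # [1 2 0], indices that would sort a

result = a.sort()       # sorts a in place
print(result, a)        # None [1 2 3]

m = np.array([[9, 1, 8],
              [3, 7, 2]])
print(np.sort(m, axis=1).tolist())   # [[1, 8, 9], [2, 3, 7]], each row sorted
print(np.sort(m, axis=0).tolist())   # [[3, 1, 2], [9, 7, 8]], each column sorted
```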

If you have any questions or comments, please leave them in the comments below!