* Written with reference to material from the Naver AI Engineer Boost Class
NumPy and pandas basic exercises
1. Matrix multiplication
>>> import numpy as np
>>> arr1 = np.random.rand(5,3)
>>> arr2 = np.random.rand(3,2)
>>> arr1 @ arr2
array([[0.30803948, 0.94545996],
       [0.22873815, 0.3066217 ],
       [0.33170786, 0.60242841],
       [0.3039172 , 0.5035964 ],
       [0.28638591, 0.98754071]])
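Because `np.random.rand` is unseeded, the values above change on every run. The following is a minimal sketch with fixed inputs (the arrays `a` and `b` are chosen for illustration and are not part of the original exercise). It shows the shape rule: a (5, 3) array times a (3, 2) array gives a (5, 2) result, and `@` matches `np.matmul`.

```python
import numpy as np

# Fixed inputs (illustrative assumption) so the result is reproducible.
a = np.arange(15).reshape(5, 3)   # shape (5, 3): rows [0 1 2], [3 4 5], ...
b = np.ones((3, 2), dtype=int)    # shape (3, 2): all ones

# The inner dimensions (3 and 3) must match; the result is (5, 2).
product = a @ b

print(product.shape)                             # (5, 2)
print(np.array_equal(product, np.matmul(a, b)))  # True
print(product[0])                                # [3 3], i.e. 0 + 1 + 2 in each column
```

With `b` set to all ones, each output row simply holds the sum of the corresponding row of `a`, which makes the result easy to check by hand.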
2. The concatenate operation
>>> import numpy as np
>>> arr1 = [[5,7], [9,11]]
>>> arr2 = [[2,4], [6,8]]
>>> print(np.concatenate([arr1, arr2], axis=0))
[[ 5  7]
 [ 9 11]
 [ 2  4]
 [ 6  8]]
>>> print(np.concatenate([arr1, arr2], axis=1))
[[ 5  7  2  4]
 [ 9 11  6  8]]
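The effect of `axis` is easiest to see in the resulting shapes. This is a small sketch that reuses the exercise's two inputs. The comparison with `np.vstack` and `np.hstack` is an addition for illustration, not part of the original exercise.

```python
import numpy as np

arr1 = np.array([[5, 7], [9, 11]])
arr2 = np.array([[2, 4], [6, 8]])

rows = np.concatenate([arr1, arr2], axis=0)  # append arr2 below arr1
cols = np.concatenate([arr1, arr2], axis=1)  # append arr2 to the right of arr1

print(rows.shape)  # (4, 2)
print(cols.shape)  # (2, 4)

# For 2-D inputs, vstack and hstack give the same results.
print(np.array_equal(rows, np.vstack([arr1, arr2])))  # True
print(np.array_equal(cols, np.hstack([arr1, arr2])))  # True
```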
3. pandas: dictionaries and Series
>>> import pandas as pd
>>> idx = ["HDD", "SSD", "USB", "CLOUD"]
>>> data = [19, 11, 5, 97]
>>> dic = dict(zip(idx, data))
>>> series = pd.Series(dic)
>>> filtered_series = series[(series >= 10) & (series <= 20)]
>>> filtered_series
HDD    19
SSD    11
dtype: int64
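The same range filter can also be written with `Series.between`, which includes both endpoints by default. This sketch is an alternative spelling, not the form used in the original exercise.

```python
import pandas as pd

series = pd.Series({"HDD": 19, "SSD": 11, "USB": 5, "CLOUD": 97})

# Boolean-mask form from the exercise.
mask_filtered = series[(series >= 10) & (series <= 20)]

# Equivalent form: between() includes both 10 and 20 by default.
between_filtered = series[series.between(10, 20)]

print(between_filtered.equals(mask_filtered))  # True
print(list(between_filtered.index))            # ['HDD', 'SSD']
```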
4. pandas: building a table and extracting data
>>> import pandas as pd
>>> df1 = {
... 'Name' : ['cherry','mango','potato','onion'],
... 'Type' : ['fruit','fruit','vegetable','vegetable'],
... 'Price' : [100,110,60,80]
... }
>>> df2 = {
... 'Name': ['pepper','carrot','banana','kiwi'],
... 'Type': ['vegetable','vegetable','fruit','fruit'],
... 'Price': [50,70,90,120]
... }
>>> df1 = pd.DataFrame(df1)
>>> df2 = pd.DataFrame(df2)
>>> df = pd.concat([df1, df2], axis=0)
>>> df.sort_values(by='Type', inplace=True)
>>> df.reset_index(drop=True, inplace=True)
>>> df
     Name       Type  Price
0  cherry      fruit    100
1   mango      fruit    110
2  banana      fruit     90
3    kiwi      fruit    120
4  potato  vegetable     60
5   onion  vegetable     80
6  pepper  vegetable     50
7  carrot  vegetable     70
>>> max_fruit_price = df.loc[df['Type'] == 'fruit', 'Price'].max()
>>> max_vegetable_price = df.loc[df['Type'] == 'vegetable', 'Price'].max()
>>> print(max_fruit_price + max_vegetable_price)
200
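The two per-type lookups above can be collapsed into a single `groupby`. The sketch below rebuilds the same eight rows in one DataFrame and computes the maximum price per `Type`. The variable name `max_by_type` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["cherry", "mango", "potato", "onion",
             "pepper", "carrot", "banana", "kiwi"],
    "Type": ["fruit", "fruit", "vegetable", "vegetable",
             "vegetable", "vegetable", "fruit", "fruit"],
    "Price": [100, 110, 60, 80, 50, 70, 90, 120],
})

# One pass over the data: maximum Price within each Type.
max_by_type = df.groupby("Type")["Price"].max()

print(max_by_type["fruit"])      # 120
print(max_by_type["vegetable"])  # 80
print(max_by_type.sum())         # 200
```

This form scales to any number of categories without writing one `df.loc[...]` expression per category.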
5. pandas: DataFrame and describe()
>>> import pandas as pd
>>> df = {
... 'sue':[55, 65, 60, 66, 57],
... 'ryan':[64, 77, 71, 79, 67],
... 'jay':[88, 81, 79, 89, 77],
... 'jane':[45, 35, 30, 46, 47],
... 'anna':[91, 96, 90, 97, 99]
... }
>>> df = pd.DataFrame(df)
>>> df.columns = ['round_1', 'round_2', 'round_3', 'round_4', 'round_5']
>>> df
   round_1  round_2  round_3  round_4  round_5
0       55       64       88       45       91
1       65       77       81       35       96
2       60       71       79       30       90
3       66       79       89       46       97
4       57       67       77       47       99
>>> df.describe().loc[['mean','max','min']]
      round_1  round_2  round_3  round_4  round_5
mean     60.6     71.6     82.8     40.6     94.6
max      66.0     79.0     89.0     47.0     99.0
min      55.0     64.0     77.0     30.0     90.0
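`describe()` computes every summary statistic and then discards most of them. When only a few are needed, `DataFrame.agg` can request them directly. This is a sketch of that alternative, not the approach used in the original exercise. It builds the table with the `round_*` column names from the start.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "round_1": [55, 65, 60, 66, 57],
    "round_2": [64, 77, 71, 79, 67],
    "round_3": [88, 81, 79, 89, 77],
    "round_4": [45, 35, 30, 46, 47],
    "round_5": [91, 96, 90, 97, 99],
})

# Ask only for the statistics of interest, in the desired row order.
summary = df.agg(["mean", "max", "min"])

# Same rows selected out of the full describe() table.
described = df.describe().loc[["mean", "max", "min"]]

print(list(summary.index))                                     # ['mean', 'max', 'min']
print(np.allclose(summary.to_numpy(), described.to_numpy()))   # True
```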