💡
1. Build a standalone decision tree model without an ensemble
2. Build a model with an ensemble: bagging
3. Build a model with an ensemble: boosting
1. Without an ensemble
- Accuracy = 100%
- A single decision tree is already an excellent model for this data, so the accuracy comes out at 100% (the test split here is only 15 rows).
💡 Steps
1. Load the data
2. Split into training and test data
3. Normalize
4. Build a standalone decision tree model
5. Train the model
6. Predict with the model
7. Evaluate the model
# Example
# 1. Load the data
import pandas as pd
iris = pd.read_csv("d:\\data\\iris2.csv")
iris
# 2. Split into training and test data
x = iris.iloc[:,0:4]
y = iris['Species']
from sklearn.model_selection import train_test_split
x_train,x_test, y_train, y_test = train_test_split(x,y, test_size = 0.1,random_state =1 )
x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))
# 3. Normalize
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(x_train)
x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)
# 4. Build a standalone decision tree model
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier(criterion = 'entropy', max_depth = 20, random_state = 2)
# 5. Train the model
model.fit(x_train2, y_train)
# 6. Predict with the model
result = model.predict(x_test2)
# 7. Evaluate the model
accuracy = sum(result == y_test)/ len(y_test)
print(accuracy)  # accuracy = 1.0, i.e. 100%
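As a side note (not in the original post), the same evaluation can be done with scikit-learn's built-in metrics; a minimal sketch, assuming the y_test and result variables from the example above:

# Alternative evaluation with sklearn.metrics (illustrative only)
from sklearn.metrics import accuracy_score, classification_report
print(accuracy_score(y_test, result))         # same value as the manual accuracy above
print(classification_report(y_test, result))  # per-class precision, recall, and F1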
2. Ensemble - Bagging ⇒ Accuracy = 100%
# Example
# 1. Load the data
import pandas as pd
iris = pd.read_csv("d:\\data\\iris2.csv")
iris
# 2. Split into training and test data
x = iris.iloc[:,0:4]
y = iris['Species']
from sklearn.model_selection import train_test_split
x_train,x_test, y_train, y_test = train_test_split(x,y, test_size = 0.1,random_state =1 )
x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))
# 3. Normalize
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(x_train)
x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)
# 4. ★★ Use an ensemble (build the bagging model)
from sklearn.tree import DecisionTreeClassifier
model2 = DecisionTreeClassifier(criterion = 'entropy', max_depth = 20, random_state = 2)
from sklearn.ensemble import BaggingClassifier
bagging_model = BaggingClassifier(model2, max_samples = 0.9,n_estimators = 25, random_state = 1)
# ※ max_samples = 0.9: each bag is filled by sampling 90% of the training data.
# n_estimators = 25: build 25 decision tree models.
# 5. Train the model
bagging_model.fit(x_train2,y_train)
# 6. Predict and evaluate the model
result2 = bagging_model.predict(x_test2)
print(sum(result2 == y_test)/len(y_test))
# accuracy = 1.0, i.e. 100%
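For comparison (not in the original post), a random forest is the most widely used bagging-style ensemble of decision trees: it bootstraps the training rows like BaggingClassifier and additionally samples a random subset of features at each split. A minimal sketch, assuming the same preprocessed variables (x_train2, y_train, x_test2, y_test); the hyperparameters are chosen only to mirror the example:

# Random forest sketch (illustrative; not part of the original example)
from sklearn.ensemble import RandomForestClassifier
rf_model = RandomForestClassifier(n_estimators = 25, random_state = 1)  # 25 trees, as above
rf_model.fit(x_train2, y_train)
rf_result = rf_model.predict(x_test2)
print(sum(rf_result == y_test) / len(y_test))  # accuracy on the 15 test rows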
3. Ensemble - Boosting ⇒ Accuracy = 100%
★ Boosting (XGBoost) has won Kaggle competitions (a short XGBoost sketch follows the example below).
# Example
# 1. Load the data
import pandas as pd
iris = pd.read_csv("d:\\data\\iris2.csv")
iris
# 2. Split into training and test data
x = iris.iloc[:,0:4]
y = iris['Species']
from sklearn.model_selection import train_test_split
x_train,x_test, y_train, y_test = train_test_split(x,y, test_size = 0.1,random_state =1 )
x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))
# 3. Normalize
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(x_train)
x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)
# 4. ★★ Use an ensemble (build the boosting model)
from sklearn.ensemble import GradientBoostingClassifier
model3 = GradientBoostingClassifier(n_estimators = 300, random_state = 2)
# 5. Train the model
model3.fit(x_train2,y_train)
# 6. Predict and evaluate the model
result3 = model3.predict(x_test2)
print(sum(result3 == y_test)/len(y_test))
# accuracy = 1.0, i.e. 100%
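The note above mentions XGBoost, while the example uses scikit-learn's GradientBoostingClassifier. Below is a minimal sketch of the same experiment with the separate xgboost package (an assumption, not part of the original post; recent XGBoost versions expect integer class labels, so the species names are label-encoded first, and the hyperparameters simply mirror the example):

# XGBoost sketch (illustrative; requires the xgboost package)
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

le = LabelEncoder()
y_train_enc = le.fit_transform(y_train)   # encode species names as 0, 1, 2
y_test_enc = le.transform(y_test)

xgb_model = XGBClassifier(n_estimators = 300, random_state = 2)  # settings mirror the example
xgb_model.fit(x_train2, y_train_enc)
xgb_result = xgb_model.predict(x_test2)
print(sum(xgb_result == y_test_enc) / len(y_test_enc))  # accuracy on the 15 test rows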