steadilee

FRACTAL

Data Analysis/ML,DL

Implementing Ensembles in Python

2022. 8. 29. 23:53
๐Ÿ’ก
1.์•™์ƒ๋ธ” ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ณ  ์˜์‚ฌ๊ฒฐ์ •ํŠธ๋ฆฌ ๋‹จ๋… ๋ชจ๋ธ ์ƒ์„ฑ
2. ์•™์ƒ๋ธ” ๋ฐฐ๊น…์œผ๋กœ ๋ชจ๋ธ ์ƒ์„ฑ
3. ์•™์ƒ๋ธ” ๋ถ€์ŠคํŒ…์œผ๋กœ ๋ชจ๋ธ ์ƒ์„ฑ

 

1. ์•™์ƒ๋ธ” ์‚ฌ์šฉํ•˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ

  • ์ •ํ™•๋„ = 100
  • ์ด๋ฏธ ์šฐ์ˆ˜ํ•œ ๋ชจ๋ธ์ด๊ธฐ์— ์ •ํ™•๋„๊ฐ€ 100์ด ๋‚˜์™”๋‹ค.
๐Ÿ’ก ์ˆœ์„œ
1. ๋ฐ์ดํ„ฐ ๋กœ๋“œ
2. ํ›ˆ๋ จ, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ๋ถ„๋ฆฌ
3. ์ •๊ทœํ™”
4. ๋‹จ๋… ์˜์‚ฌ๊ฒฐ์ •ํŠธ๋ฆฌ ๋ชจ๋ธ ์ƒ์„ฑ
5. ๋ชจ๋ธ ํ›ˆ๋ จ
6. ๋ชจ๋ธ ์˜ˆ์ธก
7. ๋ชจ๋ธ ํ‰๊ฐ€
# ์ˆœ์„œ
# 1. ๋ฐ์ดํ„ฐ ๋กœ๋“œ
import pandas as pd

iris = pd.read_csv("d:\\data\\iris2.csv")
iris


# 2. ํ›ˆ๋ จ, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ๋ถ„๋ฆฌ

x = iris.iloc[:,0:4]
y = iris['Species']

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=1)

x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))

# 3. ์ •๊ทœํ™”
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(x_train)

x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)

# 4. ๋‹จ๋… ์˜์‚ฌ๊ฒฐ์ •ํŠธ๋ฆฌ ๋ชจ๋ธ ์ƒ์„ฑ
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(criterion = 'entropy', max_depth = 20, random_state = 2)


# 5. ๋ชจ๋ธ ํ›ˆ๋ จ
model.fit(x_train2, y_train)

# 6. ๋ชจ๋ธ ์˜ˆ์ธก
result = model.predict(x_test2)

# 7. ๋ชจ๋ธ ํ‰๊ฐ€
accuracy = sum(result == y_test)/ len(y_test)
print(accuracy)                               #์ •ํ™•๋„ =100
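The single-tree steps above can be run end to end without the local CSV; this is a sketch that substitutes sklearn's built-in iris dataset (assumed equivalent to the blog's iris2.csv) and uses accuracy_score in place of the manual sum()/len() computation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# 1-3. Load, split, and normalize (same parameters as above)
x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.1, random_state=1)
scaler = MinMaxScaler().fit(x_train)
x_train2 = scaler.transform(x_train)
x_test2 = scaler.transform(x_test)

# 4-7. Build, train, predict with, and evaluate a single decision tree
model = DecisionTreeClassifier(criterion='entropy', max_depth=20, random_state=2)
model.fit(x_train2, y_train)
print(accuracy_score(y_test, model.predict(x_test2)))
```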

 

 

2. ์•™์ƒ๋ธ” - ๋ฐฐ๊น… ⇒ ์ •ํ™•๋„ =100

# ์ˆœ์„œ
# 1. ๋ฐ์ดํ„ฐ ๋กœ๋“œ
import pandas as pd

iris = pd.read_csv("d:\\data\\iris2.csv")
iris


# 2. ํ›ˆ๋ จ, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ๋ถ„๋ฆฌ

x = iris.iloc[:,0:4]
y = iris['Species']

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=1)

x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))

# 3. ์ •๊ทœํ™”
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(x_train)

x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)

#4. โ˜…โ˜… ์•™์ƒ๋ธ” ์‚ฌ์šฉ(๋ฐฐ๊น… ๋ชจ๋ธ ๋งŒ๋“ค๊ธฐ)
from sklearn.tree import DecisionTreeClassifier
model2 = DecisionTreeClassifier(criterion = 'entropy', max_depth = 20, random_state = 2)

from sklearn.ensemble import BaggingClassifier
bagging_model = BaggingClassifier(model2, max_samples=0.9, n_estimators=25, random_state=1)

# โ€ป max_samples = 0.9 ๋Š” bag์— ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ด์„ ๋•Œ, ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์˜ 90%๋ฅผ ์ƒ˜ํ”Œ๋งํ•˜๊ฒ ๋‹ค.
# n_estimators = 25 : ์˜์‚ฌ๊ฒฐ์ •ํŠธ๋ฆฌ ๋ชจ๋ธ์„ 25๊ฐœ ๋งŒ๋“ค๊ฒ ๋‹ค.


#5.๋ชจ๋ธ ํ›ˆ๋ จ

bagging_model.fit(x_train2,y_train)


#6.๋ชจ๋ธ ์˜ˆ์ธก, ํ‰๊ฐ€
result2 = bagging_model.predict(x_test2)


print(sum(result2 == y_test)/len(y_test))

# ์ •ํ™•๋„ = 100%

 

3. ์•™์ƒ๋ธ” - ๋ถ€์ŠคํŒ… ⇒ ์ •ํ™•๋„ =100

โ˜… kaggle์—์„œ ์šฐ์Šนํ•œ ๋ถ€์ŠคํŒ…(XGboost)

# ์ˆœ์„œ
# 1. ๋ฐ์ดํ„ฐ ๋กœ๋“œ
import pandas as pd

iris = pd.read_csv("d:\\data\\iris2.csv")
iris


# 2. ํ›ˆ๋ จ, ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ๋ถ„๋ฆฌ

x = iris.iloc[:,0:4]
y = iris['Species']

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=1)

x_train.shape, x_test.shape, y_train.shape, y_test.shape
# ((135, 4), (15, 4), (135,), (15,))

# 3. ์ •๊ทœํ™”
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(x_train)

x_train2 = scaler.transform(x_train)
x_test2= scaler.transform(x_test)

#4. โ˜…โ˜… ์•™์ƒ๋ธ” ์‚ฌ์šฉ(๋ถ€์ŠคํŒ… ๋ชจ๋ธ ๋งŒ๋“ค๊ธฐ)
from sklearn.ensemble import GradientBoostingClassifier

model3 = GradientBoostingClassifier(n_estimators = 300, random_state = 2)

#5.๋ชจ๋ธ ํ›ˆ๋ จ

model3.fit(x_train2,y_train)


#6.๋ชจ๋ธ ์˜ˆ์ธก, ํ‰๊ฐ€
result3 = model3.predict(x_test2)


print(sum(result3 == y_test)/len(y_test))

# ์ •ํ™•๋„ = 100%

 

 

Other posts in the 'Data Analysis/ML,DL' category

Visualizing Social Networks with R  (0) 2022.08.30
Implementing an Artificial Neural Network in Python  (0) 2022.08.30