Numpy → DataFrame
import pandas as pd
from pandas import Series, DataFrame
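# fishs is the NumPy array built in fish_api.py further down (length, weight, target columns).
# A dictionary can also be converted to a DataFrame.
fish_dataFrame = pd.DataFrame(fishs, columns=["fish_len", "fish_wei", "target"])
# What if you want to number the rows in advance?
print(fish_dataFrame)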
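The two comments above (building a DataFrame from a dictionary, and numbering rows in advance) can be sketched as follows. This is a minimal illustration, not the lecture's own code: `fish_dict`, `fish_df`, the three sample rows, and the index name `no` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values for illustration (not the full fish data set).
fish_dict = {
    "fish_len": [25.4, 26.3, 9.8],
    "fish_wei": [242.0, 290.0, 6.7],
    "target": [1.0, 1.0, 0.0],
}

# A dict of equal-length lists maps each key to one column.
fish_df = pd.DataFrame(fish_dict)

# Number the rows from 1 instead of the default 0-based RangeIndex.
fish_df.index = np.arange(1, len(fish_df) + 1)
fish_df.index.name = "no"

print(fish_df)
```

If the numbering should also be stored in the database, calling `to_sql` with `index=True` writes the index as its own column.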
Using Visual Studio Code
(1) Install the numpy and pandas libraries
Visual Studio Code → Terminal
pip install numpy
pip install pandas
(2) Insert into MariaDB
① Create the database
- Connect as root
create user 'python'@'%' identified by 'python1234';
GRANT ALL PRIVILEGES ON *.* TO 'python'@'%';
create database pythondb;
- Inserting as dict and selecting as dict works well → the pymysql library (mysql, mariadb)
- A DataFrame can also be inserted into and selected from the DB: SQLAlchemy (ORM: Object Relational Mapping)
- Class types can also be inserted and selected.
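The last point, inserting and selecting through a Python class, can be sketched with SQLAlchemy's ORM. Everything named here is an assumption for illustration: the `Fish` class, the `fish_orm` table, and the in-memory SQLite URL that stands in for MariaDB so the sketch runs without a server. It also assumes SQLAlchemy 1.4 or later is installed.

```python
from sqlalchemy import Column, Float, Integer, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Fish(Base):
    """Hypothetical mapped class; each instance is one row of fish_orm."""

    __tablename__ = "fish_orm"
    id = Column(Integer, primary_key=True)
    fish_len = Column(Float)
    fish_wei = Column(Float)
    target = Column(Float)


# Assumption: in-memory SQLite so the sketch runs without a MariaDB server.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

# insert: add objects to the session, then commit.
session.add_all([
    Fish(fish_len=25.4, fish_wei=242.0, target=1.0),
    Fish(fish_len=9.8, fish_wei=6.7, target=0.0),
])
session.commit()

# select: query with the class instead of writing SQL.
breams = session.query(Fish).filter(Fish.target == 1.0).all()
bream_rows = [(fish.fish_len, fish.fish_wei) for fish in breams]
print(bream_rows)

session.close()
```

With a running MariaDB server, the same class should work against an engine created from the `mariadb+mariadbconnector://` URL used in this post; that combination is not tested here.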
How to connect to MariaDB
(1) Install mariadb and SQLAlchemy from the terminal
(2) Set up the import / from statements
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import sqlalchemy as db
from data.fish_api import getFishData
from sqlalchemy.orm import sessionmaker
(3) engine (MariaDB connection)
engine = db.create_engine("mariadb+mariadbconnector://python:python1234@127.0.0.1:3306/pythondb")
(4) insert
def insert():
    fishs = getFishData()
    fishs.to_sql("fish", engine, index=False, if_exists="replace")
(5) select
How to insert and select with the pandas library (requires SQLAlchemy's engine (Connection))
How to use the ORM by creating a Session object (querying with Python classes)
Escaping special characters
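The linked post on escaping is not reproduced here. As an assumption about where escaping matters in this setup: a password containing characters such as `@`, `:` or `/` has to be percent-encoded before it is placed in the engine URL. The password below is hypothetical; `quote_plus` comes from the standard library.

```python
from urllib.parse import quote_plus

# Hypothetical password containing characters that are special in a URL.
raw_password = "p@ss:word/1234"

# Percent-encode it so the URL parser does not mistake it for separators.
escaped_password = quote_plus(raw_password)

db_url = (
    "mariadb+mariadbconnector://python:"
    f"{escaped_password}@127.0.0.1:3306/pythondb"
)

print(escaped_password)  # p%40ss%3Aword%2F1234
print(db_url)
```

The password `python1234` used in this post has no special characters, so it can be written into the URL unchanged.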
fish_dao.py
# pip install numpy
# pip install pandas
# python -m pip install
# How to connect to mariadb - sqlalchemy
# https://mariadb.com/ko/resources/blog/using-sqlalchemy-with-mariadb-connector-python-part-1/
# How to insert and select with the pandas library (requires sqlalchemy's engine (connection))
# https://ayoteralab.tistory.com/entry/AT-09-mariadbmysql-connection-with-python-2
# How to use the orm by creating a session object (querying with Python classes)
# https://docs.sqlalchemy.org/en/14/orm/tutorial.html#creating-a-session
# Escaping special characters
# https://freedeveloper.tistory.com/191
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import sqlalchemy as db
from data.fish_api import getFishData
from sqlalchemy.orm import sessionmaker
# Inserting as dict and selecting as dict is the most convenient. - pymysql library (mysql, mariadb)
# A DataFrame can also be inserted into and selected from the DB - SQLAlchemy (ORM)
# Class types can also be inserted and selected
# mariadb://127.0.0.1:3306/pythondb?username=python&password=python1234
engine = db.create_engine(
    "mariadb+mariadbconnector://python:python1234@127.0.0.1:3306/pythondb")
def insert():
    fishs = getFishData()
    # Replace the "fish" table with the DataFrame rows; the index is not stored.
    fishs.to_sql("fish", engine, index=False, if_exists="replace")
def select():
    df = pd.read_sql("select * from fish", con=engine)
    print(df)
# insert()
select()
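To try the same insert/select round trip without a MariaDB server, a rough stand-in is an in-memory SQLite connection, which pandas accepts directly. This swaps the database on purpose: `conn`, `sample`, and the three sample rows are hypothetical, and only the `to_sql` / `read_sql` calls mirror fish_dao.py.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite replaces the MariaDB engine for this sketch.
conn = sqlite3.connect(":memory:")

# Hypothetical three-row sample with the same columns as the fish table.
sample = pd.DataFrame(
    {
        "fish_len": [25.4, 26.3, 9.8],
        "fish_wei": [242.0, 290.0, 6.7],
        "target": [1.0, 1.0, 0.0],
    }
)

# insert: create (or replace) the table from the DataFrame.
sample.to_sql("fish", conn, index=False, if_exists="replace")

# select: read the table back into a new DataFrame.
loaded = pd.read_sql("select * from fish", con=conn)
print(loaded)

# Running the insert again replaces the table instead of appending to it.
sample.to_sql("fish", conn, index=False, if_exists="replace")
row_count = pd.read_sql("select count(*) as n from fish", con=conn)["n"][0]
print(row_count)  # 3

conn.close()
```

Because of `if_exists="replace"`, calling `insert()` repeatedly keeps the table at the same number of rows; `if_exists="append"` would add the rows again on every call.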
fish_api.py
import numpy as np
import pandas as pd
def getFishData():
    # Bream
    bream_length = [25.4, 26.3, 26.5, 29.0, 29.0, 29.7, 29.7, 30.0, 30.0, 30.7, 31.0, 31.0,
                    31.5, 32.0, 32.0, 32.0, 33.0, 33.0, 33.5, 33.5, 34.0, 34.0, 34.5, 35.0,
                    35.0, 35.0, 35.0, 36.0, 36.0, 37.0, 38.5, 38.5, 39.5, 41.0, 41.0]
    bream_weight = [242.0, 290.0, 340.0, 363.0, 430.0, 450.0, 500.0, 390.0, 450.0, 500.0, 475.0, 500.0,
                    500.0, 340.0, 600.0, 600.0, 700.0, 700.0, 610.0, 650.0, 575.0, 685.0, 620.0, 680.0,
                    700.0, 725.0, 720.0, 714.0, 850.0, 1000.0, 920.0, 955.0, 925.0, 975.0, 950.0]
    # Smelt
    smelt_length = [9.8, 10.5, 10.6, 11.0, 11.2, 11.3,
                    11.8, 11.8, 12.0, 12.2, 12.4, 13.0, 14.3, 15.0]
    smelt_weight = [6.7, 7.5, 7.0, 9.7, 9.8, 8.7,
                    10.0, 9.9, 9.8, 12.2, 13.4, 12.2, 19.7, 19.9]
    fish_length = bream_length + smelt_length
    fish_weight = bream_weight + smelt_weight
    fish_target = np.concatenate((np.ones(35), np.zeros(14)))
    fishs = np.column_stack((fish_length, fish_weight, fish_target))
    # A dictionary can also be converted to a DataFrame.
    fish_dataFrame = pd.DataFrame(
        fishs, columns=["fish_len", "fish_wei", "target"])
    # What if you want to number the rows in advance?
    return fish_dataFrame
Model training and visualization (Colab)
import numpy as np
# Bream
bream_length = [25.4, 26.3, 26.5, 29.0, 29.0, 29.7, 29.7, 30.0, 30.0, 30.7, 31.0, 31.0,
                31.5, 32.0, 32.0, 32.0, 33.0, 33.0, 33.5, 33.5, 34.0, 34.0, 34.5, 35.0,
                35.0, 35.0, 35.0, 36.0, 36.0, 37.0, 38.5, 38.5, 39.5, 41.0, 41.0]
bream_weight = [242.0, 290.0, 340.0, 363.0, 430.0, 450.0, 500.0, 390.0, 450.0, 500.0, 475.0, 500.0,
                500.0, 340.0, 600.0, 600.0, 700.0, 700.0, 610.0, 650.0, 575.0, 685.0, 620.0, 680.0,
                700.0, 725.0, 720.0, 714.0, 850.0, 1000.0, 920.0, 955.0, 925.0, 975.0, 950.0]
# Smelt
smelt_length = [9.8, 10.5, 10.6, 11.0, 11.2, 11.3,
                11.8, 11.8, 12.0, 12.2, 12.4, 13.0, 14.3, 15.0]
smelt_weight = [6.7, 7.5, 7.0, 9.7, 9.8, 8.7,
                10.0, 9.9, 9.8, 12.2, 13.4, 12.2, 19.7, 19.9]
fish_length = bream_length + smelt_length
fish_weight = bream_weight + smelt_weight
fish_target = np.concatenate((np.ones(35), np.zeros(14)))
fish_data = np.column_stack((fish_length, fish_weight))
print(fish_data)
print(fish_target)
# Shuffle randomly
np.random.seed(42)
index = np.arange(10)
print(index)
np.random.shuffle(index)
print(index)
<Practice>
n = np.arange(5)
np.random.shuffle(n)
print(n)
list1 = [3,4,5,10,15]
list2 = [0,0,0,1,1]
data = np.array(list1)
target = np.array(list2)
train_input = data[n[:3]]
print(train_input)
train_target = target[n[:3]]
print(train_target)
test_input = data[n[3:]]
print(test_input)
test_target = target[n[3:]]
print(test_target)
# [3,10,4,15,5]
# [0, 1,0, 1,0]
# Training data [3,10,4] [0,1,0]
# Validation (test) data [15,5] [1,0]
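The practice above can be wrapped in a small helper so that data and target are always indexed with the same shuffled order. This is a sketch, not part of the lecture: `shuffle_split` is a hypothetical name, and it uses `np.random.default_rng(...).permutation` in place of `np.random.seed` plus `np.random.shuffle`.

```python
import numpy as np


def shuffle_split(data, target, n_train, seed=None):
    """Shuffle row indices once, then split data and target with the same order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return data[train_idx], target[train_idx], data[test_idx], target[test_idx]


data = np.array([3, 4, 5, 10, 15])
target = np.array([0, 0, 0, 1, 1])

train_input, train_target, test_input, test_target = shuffle_split(
    data, target, n_train=3, seed=42
)
print(train_input, train_target)
print(test_input, test_target)
```

Because one index array drives both lookups, each value keeps its own label after the shuffle, which is the point of indexing with `n[:3]` and `n[3:]` in the practice code.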
<Application>
# What goes wrong if the data is split into training and validation (test) data without shuffling?
# Sampling bias occurs. (So shuffle first.)
index = np.arange(49) # 35 (bream), 14 (smelt)
np.random.shuffle(index)
print(index)
# 35 samples
train_input = fish_data[index[:35]] # training input (fed to the model)
train_target = fish_target[index[:35]] # training target (fed to the model)
# 14 samples
test_input = fish_data[index[35:]] # test input (for validation)
test_target = fish_target[index[35:]] # test target (for validation)
import matplotlib.pyplot as plt
plt.scatter(train_input[:,0], train_input[:,1]) # training
plt.scatter(test_input[:,0], test_input[:,1]) # test
plt.xlabel("length")
plt.ylabel("weight")
plt.show()
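The sampling bias mentioned in the comments can be checked with the labels alone. The sketch below assumes the same ordering as the code above (35 bream labelled 1, then 14 smelt labelled 0); the variable names `train_plain`, `test_plain`, `train_shuffled` and `test_shuffled` are hypothetical.

```python
import numpy as np

# Labels in the post's order: 35 bream (1) followed by 14 smelt (0).
fish_target = np.concatenate((np.ones(35), np.zeros(14)))

# Split without shuffling: first 35 rows for training, last 14 for testing.
train_plain = fish_target[:35]
test_plain = fish_target[35:]
print(train_plain.sum(), test_plain.sum())  # 35.0 0.0

# Split with shuffled indices, as in the post.
np.random.seed(42)
index = np.arange(49)
np.random.shuffle(index)
train_shuffled = fish_target[index[:35]]
test_shuffled = fish_target[index[35:]]
print(train_shuffled.sum(), test_shuffled.sum())
```

Without shuffling, the training set contains only bream and the test set only smelt, so a model would never see a smelt during training. After shuffling, the 35 bream labels are spread over both sets.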