'python' 카테고리의 글 목록

python

DQN network 수정 2024.03.12
Colab에서 살아남기 2023.03.07 3
강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리 2023.02.25 5
강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 2023.02.14 1
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 2023.01.30 5
python - 3 사칙 연산 2023.01.24
Python - 2 2023.01.24
파이썬 기초 1 - 설치 2023.01.24
구글 colab에서 학습하기 2023.01.20
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 2022.12.25 1

DQN network 수정

아빠는 벌레잡이 2024. 3. 12. 23:37

2024. 3. 12. 23:37

1.머리말

기존 제 블로그에 작성되어 있던 강화학습을 이용한 자동매매 프로그램에 대한 내용의 DQN network 을
deep minder의 논문 내용처럼 이미지를 분석하여 상태값을 구하는 로직과
그 상태로 액션을 구하는 방식으로 변경하는 내용을 순서되로 기술하고자 합니다.

2.H/W의 변경

OS를 기존의 centos에서 ubuntu로 변경했습니다.

저의 경우 기존 하드디스크를 root에 30%를 할당하고
/home에 50%
/var에 20%를 할당하여 사용중이 었습니다.

그래서 이번에 ubuntu 를 깔때 하드디스크를 삭제하지 않고 root와 var그리고 home을 새로 매핑해주고 나머지는 그대로 밀어버렸습니다.

나머지는 ubuntu 설치 프로그램에게 맡겼습니다.

설치가 완료되고 기존 src가 그대로 살아 있는 것을 확인했습니다.

그런데 문제가 있었습니다.
정확하게는 2개의 문제가 있었습니다.

기존 cmos 체제의 bios가 아닌 ufi체제로 설치가 바뀌면서 네트워크이름이 계속 변경되었습니다.

리부팅할때마다 네트워크가 dhcp로 바뀌는 바람에 집에서 원격접속을 할수가 없었습니다.

처음에 enp4s0에서 시작한 이름이 지금은 enp15s0입니다.

빨리 ubuntu 에서 패치가 되었으면 좋겠네요.
원인은 알지만 그냥 두었습니다.

두번째 문제는 네트워크를 static으로 변경하고 고정 아이피를 부여했지만 네임서버를 못 가져오는 문제가 발생했습니다.

/etc/netplan/*.yam을 수정하는 대신 /etc/systemd/reserve.conf을 수정했습니다.

여기서 주의할 사항은
DNS=아이피주소1 아이피주소2
이런식으로 컴마 없이 기입을 해야 작동한 다는 겁니다.
혹 yamel에 익숙하신 분들은 컴마른 사용하시면 안됩니다.

그리고 중요한 한가지는 gpu보드는 브랜드는 상관없지만
gpu성능은 동일한 모델을 선택해야 합니다.

저는 rtx3060oc 모델 총8장을 실장했습니다.
소스에서는 board 아이디를 0부터 7까지 배열 형식으로 기입을 해야 인식이 가능합니다.

ubuntu 로 os를 변경한 이유는
ubuntu 의 경우 LTS버젼을 설치했을때 기준으로 최신 NVIDIA드라이버를 모두 지원하고 있을 뿐 아니라 CUDA,CUNN버젼도 최신 버젼을 apt repository 에서 지원을 해주기 때문에 굉장히 편리합니다.

rhl계열에서 아직 지원이 안되기 때문에 학습중 서버가 멈춰버립니다.

예전에는 디펑트가 발생해도 reboot이 가능했던것 같은데 지금은 안되네요.

이부분은 ubuntu도 마찬가지인듯 하고요.
하지만 cuda최신버젼을 설치하니 그런 현상이 아직까지는 일어나지 않네요.

문제는 기본 드라이브만 깔고 소스를 돌려도 에러가 나지 않지만 조금 지나 보드2번으로 연산이 옮겨 갈때쯤 서버가 죽어버립니다.
물론 아무런 예외도 만들지 않고 그냥 죽어버리네요.

처음 저는 파이토치를 사용해서 개발을 하는 바람에 지금처럼 tensoflow 가 세상을 지배한 이시점에 모든게 힘든 사항입니다.

하지만 tensoflow 도 karas에 너무 의존적으로 변하고 모델도 수정하기 힘든 점은 저는 문제가 있다고 봅니다.

어째든 지금 LLT모델은 karas버젼의 라이브러리밖에 없어서 추가 개발시 tensoflow 로 넘어가야 하나 고민입니다.

차리리 pytorch 가 tensoflow 내에서 karas와 경쟁하는 구조로 가는게 어떨까 하는 바램입니다.

암튼 중요한 점은 rhl은 버려야 한다는 점입니다.

그리고 cuda-toolkit을 설치해야만 정상 동작합니다.

다음 블로그에 그 다음 내용을 기재하겠습니다.

LIST

'python > 자동매매 프로그램' 카테고리의 다른 글

Colab에서 살아남기 (3)	2023.03.07
강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리 (5)	2023.02.25
강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 (1)	2023.02.14
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 (5)	2023.01.30
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 (1)	2022.12.25

Colab에서 살아남기

아빠는 벌레잡이 2023. 3. 7. 13:01

2023. 3. 7. 13:01

Colab의 기본모드는 참 애매합니다.

안되는 건아니지만 된다고 하기에도 참 뭐하고.

일단 그래서 PRO를 질러 봤습니다.

마찬가지로 참 애매합니다.

런타임 구성을 GPU로 바꿨지만 역시 되는 것도 안되는 것도 아닌

- 참 본론으로 들어가기전에 한국에서는 Colab결재가 안됩니다.

다른 이유 때문은 아니고 우편번호를 입력하라고 나오는데 한국 우편번호는 밴입니다.

그래서 사람들은 애플 주소나 메타 또는 테슬라 주소를 검색해서 우편번호를 가져옵니다.

아마도 우리나라 가입자 70%는 애플 직원일겁니다. -

그래서 PRO+를 질러 봤습니다.

속은 시원합니다.

이제 뭔가 되는 느낌입니다.

하지만 여기서 느낀 점이 있어서 이 글을 쓰게 되었습니다.

(쥬피터)노트북 환경의 특징은 실행하고자 하는 소스 단위를 셀이라는 단위로 나누어 실행이 가능합니다.

즉 덩어리 소스가 아닌 소형의 소스로 만든 다음 나눠서 실행이 가능합니다.

Colab 유료버젼 사용시 주의 사항을 요약하고 전략을 알려 드립니다.

1. 소스를 세세히 나눠라.

# %%
%pip install kaleido
# %%
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

# %%
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pandas as pd
import numpy as np
import time
from PIL import Image
import io
import os.path as path
import math

# %%
batch_size = 2000
learning_rate = 0.0005
num_epoch = 1000
data_size = 100
action_kind = 4
screen_height = 50
screen_width  = 70
qsize = 512
# %%
# %%
class Account():
    def __init__(self, df, origin) -> None:
        self.BASIC_FEES = 0.0005
        self.hold_score = 0
        self.byu_score = 0
        self.sell_score = 0        
        self.orgin_money = origin
        self.money = origin
        self.balance = origin
        self.unit  = 0
        self.buy_index = 0
        self.max_rate = 0
        self.df = df
        self.rate = 0
        self.old_rate = 0
        self.bak_unit = 0
        self.bak_balance = 0
    
    def reset(self):
        self.balance = self.orgin_money
        self.money = self.orgin_money
        self.old_rate = 0
        self.unit = 0
        self.rate = 0
        self.buy_index = 0
        
    def select(self, action:int): 
        ret_action = 0
        if self.unit > 0 :
            ret_action = 0 if action == 1 else action
        else:
            ret_action = 0 if action == 2 else action
        return ret_action
    
    def exec_action(self, action, idx):
        real_action = self.select(action)
        if real_action == 1:
            return self.unit_buy(idx), real_action
        elif real_action == 2:
            return self.unit_sell(idx), real_action
        else:
            return self.unit_hold(idx), real_action
    
    def unit_buy(self, index):
        self.back_up()
        buy_balance = self.balance
        self.buy_index = index
        self.old_rate = self.rate
        while True:
            buy_unit = buy_balance / self.df.loc[index, 'close']
            amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
            if self.balance < amount :
                buy_balance -= self.balance * 0.0001
            else:
                self.balance -= amount
                self.unit = buy_unit
                self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                # if index%50 == 0 : print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                break
        return self.rate
        
    def unit_sell(self, index):
        self.back_up()
        self.old_rate = self.rate
        sell_balance = self.unit * self.df.loc[index, 'close'] - (self.unit * self.df.loc[index, 'close']) * self.BASIC_FEES
        self.balance += sell_balance
        self.unit = 0
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # if index%50 == 0 : print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        self.money = self.balance
        self.max_rate = 0
        return self.rate
    
    def unit_hold(self, index):
        self.old_rate = self.rate
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # print("index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        return self.rate
    
    def get_newaction(self, index):
        if self.unit > 0:
            hold_rate = ((self.unit * self.df.loc[index, "sma3"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            sell_rate = ((self.unit * self.df.loc[index, "close"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > sell_rate else 2
        else:
            buy_balance = self.balance
            self.buy_index = index
            self.old_rate = self.rate
            while True:
                buy_unit = buy_balance / self.df.loc[index, 'close']
                amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
                if self.balance < amount :
                    buy_balance -= self.balance * 0.0001
                else:
                    self.balance -= amount
                    self.unit = buy_unit
                    buy_rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                    break
            hold_rate = (self.balance - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > buy_rate else 1
        
    def back_up(self):
        self.bak_balance = self.balance
        self.bak_unit = self.unit

    def  back_ward(self):
        self.balance = self.bak_balance 
        self.unit = self.bak_unit
                
    def is_bankrupt(self):
        return (self.money < 10000)

# %%
def get_chart(df, idx, max_data:int=300, i_w:int=140, i_h:int=100):
    ndf = df.head(idx - start_idx).tail(max_data)
    ndf.reset_index(drop=True, inplace=True)
    # print(ndf)
    
    candle = go.Candlestick(x=ndf.index,open=ndf['open'],high=ndf['high'],low=ndf['low'],close=ndf['close'], increasing_line_color = 'red',decreasing_line_color = 'blue', showlegend=False)
    upper = go.Scatter(x=ndf.index, y=ndf['upper'], line=dict(color='red', width=2), name='upper', showlegend=False)
    ma20 = go.Scatter(x=ndf.index, y=ndf['ma20'], line=dict(color='black', width=2), name='ma20', showlegend=False)
    lower = go.Scatter(x=ndf.index, y=ndf['lower'], line=dict(color='blue', width=2), name='lower', showlegend=False)

    volume = go.Bar(x=ndf.index, y=ndf['volume'], marker_color='red', name='volume', showlegend=False)

    MACD = go.Scatter(x=ndf.index, y=ndf['macd'], line=dict(color='blue', width=2), name='MACD', legendgroup='group2', legendgrouptitle_text='MACD')
    MACD_Signal = go.Scatter(x=ndf.index, y=ndf['signal'], line=dict(dash='dashdot', color='green', width=2), name='MACD_Signal')
    MACD_Oscil = go.Bar(x=ndf.index, y=ndf['flag'], marker_color='purple', name='MACD_Oscil')

    fast_k = go.Scatter(x=ndf.index, y=ndf['fast_k'], line=dict(color='skyblue', width=2), name='fast_k', legendgroup='group3', legendgrouptitle_text='%K %D')
    slow_d = go.Scatter(x=ndf.index, y=ndf['slow_d'], line=dict(dash='dashdot', color='black', width=2), name='slow_d')

    PB = go.Scatter(x=ndf.index, y=ndf['PB']*100, line=dict(color='blue', width=2), name='PB', legendgroup='group4', legendgrouptitle_text='PB, MFI')
    MFI10 = go.Scatter(x=ndf.index, y=ndf['MFI10'], line=dict(dash='dashdot', color='green', width=2), name='MFI10')

    RSI = go.Scatter(x=ndf.index, y=ndf['rsi14'], line=dict(color='red', width=2), name='RSI', legendgroup='group5', legendgrouptitle_text='RSI')
    
    # 스타일
    fig = ms.make_subplots(rows=5, cols=2, specs=[[{'rowspan':4},{}],[None,{}],[None,{}],[None,{}],[{},{}]], shared_xaxes=True, horizontal_spacing=0.03, vertical_spacing=0.01)

    fig.add_trace(candle,row=1,col=1)
    fig.add_trace(upper,row=1,col=1)
    fig.add_trace(ma20,row=1,col=1)
    fig.add_trace(lower,row=1,col=1)

    fig.add_trace(volume,row=5,col=1)

    fig.add_trace(candle,row=1,col=2)
    fig.add_trace(upper,row=1,col=2)
    fig.add_trace(ma20,row=1,col=2)
    fig.add_trace(lower,row=1,col=2)

    fig.add_trace(MACD,row=2,col=2)
    fig.add_trace(MACD_Signal,row=2,col=2)
    fig.add_trace(MACD_Oscil,row=2,col=2)

    fig.add_trace(fast_k,row=3,col=2)
    fig.add_trace(slow_d,row=3,col=2)

    fig.add_trace(PB,row=4,col=2)
    fig.add_trace(MFI10,row=4,col=2)

    fig.add_trace(RSI,row=5,col=2)

    # 추세추종
    # trend_fol = 0
    # trend_refol = 0
    # for i in ndf.index:
    #     if ndf['PB'][i] > 0.8 and ndf['MFI10'][i] > 80:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='orange', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)
    #     elif ndf['PB'][i] < 0.2 and ndf['MFI10'][i] < 20:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='darkblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)

    # 역추세추종
    # for i in ndf.index:
    #     if ndf['PB'][i] < 0.05 and ndf['IIP21'][i] > 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='purple', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)  #보라
    #         fig.add_trace(trend_refol,row=1,col=1)
    #     elif df['PB'][i] > 0.95 and ndf['IIP21'][i] < 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='skyblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)  #하늘
    #         fig.add_trace(trend_refol,row=1,col=1)    

    # fig.add_trace(trend_fol,row=1,col=1)
    # 추세추총전략을 통해 캔들차트에 표시합니다.

    # fig.add_trace(trend_refol,row=1,col=1)
    # 역추세 전략을 통해 캔들차트에 표시합니다.
    
    # fig.update_layout(autosize=True, xaxis1_rangeslider_visible=False, xaxis2_rangeslider_visible=False, margin=dict(l=50,r=50,t=50,b=50), template='seaborn', title=f'({ticker})의 날짜: ETH [추세추종전략:오↑파↓] [역추세전략:보↑하↓]')
    # fig.update_xaxes(tickformat='%y년%m월%d일', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray', showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_yaxes(tickformat=',d', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray',showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_traces(xhoverformat='%y년%m월%d일')
    # size = len(img)
    # img = plt.io.to_image(fig, format='png')
    # canvas = FigureCanvasBase(fig)
    # img = Image.frombytes(mode='RGB', size=(700, 500), data=fig.to_image().to, decoder_name='raw')
    # img = Image.frombytes('RGBA', (700, 500), plt.io.to_image(fig, format='png'), 'raw')
    # img = Image.fromarray('RGBA', (700, 500), np.array(plt.io.to_image(fig, format='png')), 'raw')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'RGB')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'L')
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=140, height=100)))
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=700, height=500)))
    img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png')))
    # print(img)
    img.convert("RGB")
    # img.show()
    # print(img)
    img.thumbnail((i_h, i_w), Image.LANCZOS)
    # print(img)
    # img = Image.Image(img, 'RGB')
    # img.show()
    return img

# %%
# %%
class DQN(nn.Module):
    def __init__(self, h, w, outputs, qsize):
        super(DQN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(4, h*w, kernel_size=5, stride=5),
            nn.BatchNorm2d(h*w),
            nn.ReLU()
        )
        self.conv2 = nn.Sequential(
            nn.MaxPool2d(kernel_size=2,stride=2),
            nn.Conv2d(h*w, 4 * qsize, kernel_size=(3,5), stride=5),
            nn.BatchNorm2d(4 * qsize),
            nn.ReLU()
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(4 * qsize, qsize, kernel_size=1, stride=5),
            nn.BatchNorm2d(qsize),
            nn.ReLU()
        )
        self.head = nn.Sequential(
            nn.Dropout(0.25),
            # nn.Linear(qsize, qsize*2),
            # nn.ReLU(),
            # nn.Linear(qsize*2, qsize*2),
            # nn.LeakyReLU(),
            # nn.Linear(qsize*2, qsize*4),
            # nn.LeakyReLU(),
            # nn.Linear(qsize*4, qsize*8),
            # nn.LeakyReLU(),
            # nn.Dropout(0.25),
            # nn.Linear(qsize*8, qsize*4),
            # nn.LeakyReLU(),
            # nn.Linear(qsize*4, qsize),
            # nn.LeakyReLU(),
            nn.Linear(qsize, outputs),
            # nn.ReLU(),
            # nn.Softmax(dim=1)
        )

    # Called with either one element to determine next action, or a batch
    # during optimization. Returns tensor([[left0exp,right0exp]...]).
    def forward(self, x):
        x = self.conv1(x)
        # print("x1------->", x.size())
        # x = F.dropout2d(x)
        x = self.conv2(x)
        # print("x2------->", x.size())
        # x = F.dropout2d(x)
        x = self.conv3(x)
        # print("x3------->", x.size())
        x = x.view(x.size(0), -1)
        # print("x6------->", x.size())
        x = self.head(x)
        # print("x7------->", x.size())
        return  x
# %%
from google.colab import drive
import os
drive.mount('/content/drive')
os.chdir("/content/drive/MyDrive/dqn3_test")

# %%
def select_action(df, idx):
    action = ((df.loc[idx, "close"] - df.loc[idx, "closemin"]) * (action_kind-1) // (df.loc[idx, "closemax"] - df.loc[idx, "closemin"]))
    action = 1 if action == 0 else ((action_kind - 2) if action == (action_kind - 1) else action)
    action = 0 if df.loc[idx, "low"] <= df.loc[idx, "closemin"] else action
    action = (action_kind-1) if df.loc[idx, "high"] >= df.loc[idx, "closemax"] else action
    # action = action if action <=0 else 1
    # action = action if action >= 2 else 1
    action = int(action)
    # print("idx{0:d} action[{1}] deg[{2}] close:{3} closemin:{4} closemax:{5}".format(idx, action, df.loc[idx, "sma20deg"], df.loc[idx, "close"], df.loc[idx, "closemin"], df.loc[idx, "closemax"]))
    return action

# %%
df = pd.read_csv("data_all.dat")
start_idx = df.index[0]
print("start_idx[{}]".format(start_idx))

# %%
converter = torchvision.transforms.ToTensor()

# %%
# gpu가 사용 가능한 경우에는 device를 gpu로 설정하고 불가능하면 cpu로 설정합니다.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# %%
# 모델을 지정한 장치로 올립니다.
model = DQN(screen_height, screen_width, action_kind, qsize=qsize).to(device)
if path.exists("train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize)):
    model.load_state_dict(torch.load("train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind,device,qsize)))
model.eval()
loss_func = nn.CrossEntropyLoss()

# 최적화함수로는 Adam을 사용합니다.
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
optimizer.zero_grad()
# scheduler
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer=optimizer, mode='min')

# %%
charts = []
charts = []
# files = os.listdir(".")
# for filename in files :
for i in range(39):
    filename = "chart_all_{0:04d}.dmp".format(i)
    if filename.endswith(".dmp"):
        for j in range(10):
            tmp = torch.load(filename)
            for item in tmp:
                charts.append(item)
print("charts:", len(charts))

#%%
stat = None
if stat == None:
    stat = model.state_dict()
train_loader = DataLoader(charts,batch_size=batch_size, shuffle=True,num_workers=2,drop_last=True)
test_loader = DataLoader(charts,batch_size=batch_size, shuffle=False,num_workers=2,drop_last=True)
loss_arr =[]
model.load_state_dict(stat)
model.train()
loss_func.train()
lr_step = 0
for i in range(num_epoch):
    oldloss = None
    for j,[image,label] in enumerate(train_loader):
        x = image.to(device).squeeze(1)
        y = label
        y = y.type(torch.LongTensor).to(device)
        # print("j[{0}]".format(j), x , y)
        output = model.forward(x)
        # print("output:", output)
        loss = loss_func(output,y)
        if oldloss == None:
            oldloss = loss.cpu().detach().numpy()
        loss.backward()
        # optimizer.zero_grad()
        optimizer.step()

        if j % 300 == 0:
            print(loss.cpu().detach().numpy())
            loss_arr.append(loss.cpu().detach().numpy())
            stat = model.state_dict()
            torch.save(stat,"train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize))
    stat = model.state_dict()
    torch.save(stat,"train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize))
    lr_step = oldloss - loss.cpu().detach().numpy()
    scheduler.step(lr_step)
pltx.plot(loss_arr)
pltx.show()

#%%
# 맞은 개수, 전체 개수를 저장할 변수를 지정합니다.
correct = 0
total = 0.000000000001

model.eval()
loss_func.eval()
# 인퍼런스 모드를 위해 no_grad 해줍니다.
with torch.no_grad():
# 테스트로더에서 이미지와 정답을 불러옵니다.
    for image,label in test_loader:
        # 두 데이터 모두 장치에 올립니다.
        x = image.to(device).squeeze(1)
        y = label.to(device)

        # 모델에 데이터를 넣고 결과값을 얻습니다.
        output = model.forward(x)

        # https://pytorch.org/docs/stable/torch.html?highlight=max#torch.max
        # torch.max를 이용해 최대 값 및 최대값 인덱스를 뽑아냅니다.
        # 여기서는 최대값은 필요없기 때문에 인덱스만 사용합니다.
        _,output_index = torch.max(output,1)

        # 전체 개수는 라벨의 개수로 더해줍니다.
        # 전체 개수를 알고 있음에도 이렇게 하는 이유는 batch_size, drop_last의 영향으로 몇몇 데이터가 잘릴수도 있기 때문입니다.
        total += label.size(0)

        # 모델의 결과의 최대값 인덱스와 라벨이 일치하는 개수를 correct에 더해줍니다.
        correct += (output_index == y).sum().float()

# 테스트 데이터 전체에 대해 위의 작업을 시행한 후 정확도를 구해줍니다.
print("Accuracy of Test Data: {}%".format(100*correct/total))

#@title
model.eval()
account = Account(df, 50000000)
account.reset()

with torch.no_grad():
    tot = 0.0000000000001
    corr = 0.0
    predict_arr = []
    for idy in df.index:
        x = get_chart(df, idy, data_size, i_w = screen_width, i_h = screen_height)
        x = converter(x).unsqueeze(0).to(device).squeeze(1)

        output = model.forward(x)

        _,action = torch.max(output,1)
        # print("action:", action)
        action = action.cpu().numpy()[0]
        # print("action:", action)
        predict_arr[action] += 1
        sel_act = select_action(df, idy)
        tot += 1 if sel_act in [0, action_kind-1] or action in [0, action_kind-1] else 0
        corr += 1 if (sel_act in [0, action_kind-1] or action in [0, action_kind-1]) and sel_act == action else 0
        reward, real_action = account.exec_action(2 if action == (action_kind - 1) else (1 if action == 0 else 0) , idy)
        if idy % 100 == 0:
            print("progress {0:.2f}".format(idy * 100 / df.index.max()))
            print("idy:%d action[%d:%d] price [%.4f] unit[%.4f] agent rate:%.05f remind money:%.02f accuracy:%.02f" 
                % (idy, sel_act, action, df.loc[idy, 'close'], account.unit, account.rate, account.balance + account.unit * df.loc[idy, 'close'], 100*corr/tot))
        print("prediction count")
        for act in predict_arr:
            print("action{0:02d}  ==>  {1:d}".format(act, predict_arr[act]))

Colab을 사용하기 위해서는 구글 라이브러리 실행부가 있어야 합니다.

그런 부분을 될 수 있으면 위로 올립니다.

선언부(import)가 있는 부분 중 소스에 필수 적인것은 한곳에 모으고 구글 라이브러리에서만 사용하는 것은 구글 라이브러리가 있는 곳에 모읍니다.

2. GPU사용부와 비GPU부분을 확실히 나눈다.

GPU사용부는 확습과 테스트 두곳 뿐이다.

모델 선언부도 당연히 아니므로 그 부분도 분리 한다.

3. 일반 메모리를 사용하는 데이터 처리부분은 데이터 사이즈를 몇KB단위로 나누어 torch.save함수를 이용하여 파일로 저장한다.

마치 알집을 번호 붙여서 나누는 것 처럼.

그리고 사용할 때 한 배열에 불러서 사용한다.

학습용 데이터는 제각각에 생각보다 시간이 많이 들고 데이터 사이즈가 어느 이상 크지면 구글의 컴퓨팅 단위는 순식간에 0이 되어 버립니다.

PRO+를 구매해도 하루면 땡이죠.

그리고는 경고는 떠지만 추가 구입할 방법이 없습니다.

(완전히 0이 되면 비구독으로 구매가 가능합니다.)

즉 메모리 사용량이 넘치면 프로세스가 느려지면서 돈이 공중으로 금방 사라집니다.

한달 정도는 쓸거라는 생각은 한 순간에 희망사항이 됩니다.

그래서 아주 작은 단위로 데이터를 메모리로 로딩한 다음 torch.save 와 torch.load를 활용해서 파일로 저장하고

번호를 붙입니다.

메모리에 올리는 것도 반대로 합니다.

작은 파일을 만드는 작은 로컬 컴에서 해야합니다.

이 작업은 구글 Colab에서 하다가는 돈 7만원은 금방 날라 갑니다.

로컬 피씨에서 해서 구글 드라이브에 올리면 됩니다.

4. 학습을 제외한 나머지는 로컬에서 실행합니다.

그렇게 할려면 최대한 셀을 나누어야 합니다.

로컬에도 GPU환경이 있다면 모델 수정에만 Colab을 사용합니다.

이 비용도 상당합니다.

5. tip으로 어느 정도 loss가 줄면 학습 부분만 계속 해서 실행하면서 계속 loss를 내릴 수 있습니다.

전체 소스를 다시 로딩한다면 상당한 시간이 걸릴 수 있지만

중간에 중간 cell 단위로 프로그램을 실행할 수 있으므로 leanring rate를 줄이고 실행한

다음 학습률을 실행하면서 훨씬 빠르게 학습이 가능 합니다.

그리고 모델을 저장하는 부분도 cell을 분리 한다면

어느 정도 학습이 이루어 지고 모델을 저장하는 부분을 따로 실행하여 모델을 임시 저장한 다음

다시 학습을 하고 또 테스트를 진행한 다음 다시 학습을 할 수 도 있습니다.

결론 : Colab PRO와 PRO+는 생각보다 비쌉니다.

이유는 단위가 컴퓨팅 단위로 고용량 메모리나 프리미엄 GPU를 사용하려면 그 만큼의 비용이 추가로 듭니다.

그러므로 데이터와 비GPU영역의 작업을 최대한 효과적으로 해서 PRO버젼을 사용시 메모리로 데이타를 옮기는 작업을 최소화해야 합니다.

그렇지 않으면

"한시간 후에 구매용량이 부족하여 작업이 종료되며 종료시 런타임이 재작동하여 기존 작업을 잃을 수 있다"

라는 개쓰레기 같은 팝업과 결재가 불가능하여 돈을 추가로 내려고 해도 낼 수 없다는 신기한 경험을 하게 됩니다.

아니 내가 돈을 내겠다는데 안된다고 ?

LIST

저작자표시 비영리

'python > 자동매매 프로그램' 카테고리의 다른 글

DQN network 수정 (0)	2024.03.12
강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리 (5)	2023.02.25
강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 (1)	2023.02.14
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 (5)	2023.01.30
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 (1)	2022.12.25

강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리

아빠는 벌레잡이 2023. 2. 25. 17:55

2023. 2. 25. 17:55

전체 학습을 한 파일로 돌리면 많은 경우 5일 이상의 시간이 걸립니다.

이런 부분을 Joblib lib를 이용하여 오래 걸리는 부분을 작은 단위로 분리하여 학습을 작은 단위별로 진행할 수 있게 분리했습니다.

다시 해야할 부분만 진행하여 학습을 강화하다면 효율적일 것 같습니다.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pymysql
import pandas as pd
import numpy as np
import time
import talib
from PIL import Image
import io
import os.path as path
import joblib
import common.constrant as Const
import math

ticker = 'ETH'
class Market():
    def __init__(self) -> None:
        self.df = pd.DataFrame()
    
    def add_bbands(self):
        try:
            print("볼리져 밴드 구하기:%s"  % time.ctime(time.time()))
            self.df['ma20'] = talib.SMA(np.asarray(self.df['close']), 20)
            self.df['stddev'] = self.df['close'].rolling(window=20).std() # 20일 이동표준편차
            upper, middle, lower = talib.BBANDS(np.asarray(self.df['close']), timeperiod=40, nbdevup=2.3, nbdevdn=2.3, matype=0)
            self.df['lower']=lower
            self.df['middle']=middle
            self.df['upper']=upper
            # self.df['bbands_sub']=self.df.upper - self.df.lower
            # self.df['bbands_submax'] =  self.df.bbands_sub.rolling(window=840, min_periods=10).max()
            # self.df['bbands_submin'] =  self.df.bbands_sub.rolling(window=840, min_periods=10).min()
            # self.df['bbs_deg'] = (self.df.mavg - self.df.mavg.shift()).apply(lambda x: math.degrees(math.atan2(x, 40))) 

            # self.df['pct8']=(self.df.close - self.df.lower)/(self.df.upper - self.df.lower)
            # self.df['pct8vsclose']=self.df.close - self.df.pct8
        except Exception as ex:
            print("`add_bbands -> exception! %s `" % str(ex))

    def add_relative_strength(self):
        try:
            print("상대 강도 지수 구하기:%s"  % time.ctime(time.time()))
            rsi14 = talib.RSI(np.asarray(self.df['close']), 14)
            self.df['rsi14'] = rsi14
        except Exception as ex:
            print("`add_relative_strength -> exception! %s `" % str(ex))
            
    def add_iip(self):
        try:
            #역추세전략을 위한 IIP계산
            self.df['II'] = (2*self.df['close']-self.df['high']-self.df['low'])/(self.df['high']-self.df['low'])*self.df['volume']
            self.df['IIP21'] = self.df['II'].rolling(window=21).sum()/self.df['volume'].rolling(window=21).sum()*100            
        except Exception as ex:
            print("`add_iip -> exception! %s `" % str(ex))
            
    def add_stock_cast(self):
        try:
            # 스토캐스틱 구하기
            self.df['ndays_high'] = self.df['high'].rolling(window=14, min_periods=1).max()    # 14일 중 최고가
            self.df['ndays_low'] = self.df['low'].rolling(window=14, min_periods=1).min()      # 14일 중 최저가
            self.df['fast_k'] = (self.df['close'] - self.df['ndays_low']) / (self.df['ndays_high'] - self.df['ndays_low']) * 100  # Fast %K 구하기
            self.df['slow_d'] = self.df['fast_k'].rolling(window=3).mean()    # Slow %D 구하기        except Exception as ex:
        except Exception as ex:
            print("`add_stock_cast -> exception! %s `" % str(ex))
    
    def add_mfi(self):
        try:
            # MFI 구하기
            self.df['PB'] = (self.df['close'] - self.df['lower']) / (self.df['upper'] - self.df['lower'])
            self.df['TP'] = (self.df['high'] + self.df['low'] + self.df['close']) / 3
            self.df['PMF'] = 0
            self.df['NMF'] = 0
            for i in range(len(self.df.close)-1):
                if self.df.TP.values[i] < self.df.TP.values[i+1]:
                    self.df.PMF.values[i+1] = self.df.TP.values[i+1] * self.df.volume.values[i+1]
                    self.df.NMF.values[i+1] = 0
                else:
                    self.df.NMF.values[i+1] = self.df.TP.values[i+1] * self.df.volume.values[i+1]
                    self.df.PMF.values[i+1] = 0
            self.df['MFR'] = (self.df.PMF.rolling(window=10).sum() / self.df.NMF.rolling(window=10).sum())
            self.df['MFI10'] = 100 - 100 / (1 + self.df['MFR'])    
        except Exception as ex:
            print("`add_mfi -> exception! %s `" % str(ex))

    def add_macd(self, sort, long, sig):
        try:
            print("MACD 구하기:%s, sort:%d, long:%d, sig:%d"  % (time.ctime(time.time()), sort, long, sig))
            macd, macdsignal, macdhist = talib.MACD(np.asarray(self.df['close']), sort, long, sig) 
            self.df['macd'] = macd
            # self.df['macdmax'] = self.df.macd.rolling(window=840, min_periods=100).max()
            # self.df['macdmin'] = self.df.macd.rolling(window=840, min_periods=100).min()
            # self.df['macd_max_rate'] = self.df.apply(lambda x: (x['macdmax'] - x['macd']) * 100 / (x['macdmax'] - x['macdmin']), axis=1)
            # self.df['macd_min_rate'] = self.df.apply(lambda x: (x['macd'] - x['macdmin']) * 100 / (x['macdmax'] - x['macdmin']), axis=1)
            # self.df['prv_macd_degrees'] = self.df.macd_degrees.shift()
            self.df['signal'] = macdsignal
            self.df['flag'] = macdhist
            # self.df['prv_osc_degrees'] = self.df.osc_degrees.shift()
        except Exception as ex:
            print("`add_macd -> exception! %s `" % str(ex))

    def get_data(self):
        table_name="TB_ETH_TRADE"
        sqlcon = pymysql.connect(host='192.168.255.59', user='yuxxxx', password='------', db='eth')
        cursor = sqlcon.cursor(pymysql.cursors.DictCursor)
        str_query = """select 
            'time'
            ,`start` as open
            , high
            , low
            , close
            , volume
            , macd
            , macdmax
            , macdmin
            , `signal`
            , `flag`
            , osc_degrees
            , sma1200
            , sma1200_degrees
            , wma1200
            , wma1200_degrees
            from %s""" % table_name
        cursor.execute(str_query)
        data = cursor.fetchall()
        self.df = pd.DataFrame(data)
        sqlcon.close()
        self.df['sma3'] = talib.SMA(np.asarray(self.df['close']), 3)
        self.df['sma20'] = talib.SMA(np.asarray(self.df['close']), 20)
        self.df['sma20deg'] = (self.df.sma20 - self.df.sma20.shift()).apply(lambda x: math.degrees(math.atan2(x, Const.DEGREES_X))) 
        self.df['closemin'] = self.df.apply(lambda x: x['close'] if x['sma20deg'] >= 0 else x['high'], axis = 1).rolling(window=100, center=True).min() 
        self.df['closemax'] = self.df.apply(lambda x: x['close'] if x['sma20deg'] <= 0 else 0, axis = 1).rolling(window=100, center=True).max()
        self.add_macd(12,26,9)
        self.add_bbands()
        self.add_relative_strength()
        self.add_stock_cast()
        self.add_iip()
        self.add_mfi()
        # self.df.dropna(inplace=True)
        self.df.reset_index(drop=True,inplace=True)
        return self.df


batch_size = 200
learning_rate = 0.001
num_epoch = 100
data_size = 100
action_kind = 5
screen_height = 50
screen_width  = 70
qsize = 2048

mk = Market()
df = mk.get_data()
df = df.tail((df.index.max()//batch_size)*batch_size)

df.to_csv("data_all.dat")

이 부분은 데이터를 읽어 드리는 부분인데 이미 DB에 저장된 데이터를 읽어 오므로 1~2분 정도면 끝납니다. 그래서 데이터를 data_all.dat파일에 저장합니다.

사실 이 파일만 있으면 모든게 해결되는 거죠.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pymysql
import pandas as pd
import numpy as np
import time
import talib
from PIL import Image
import io
import os.path as path
import joblib
# import common.constrant as Const

batch_size = 200
learning_rate = 0.001
num_epoch = 100
data_size = 100
action_kind = 4
screen_height = 50
screen_width  = 70
qsize = 2048

df = pd.read_csv("data_all.dat")
df = df.tail(100000)
start_idx = df.index[0]
print("start_idx[{}]".format(start_idx))

def get_chart(df, idx, max_data:int=300, i_w:int=140, i_h:int=100):
    ndf = df.head(idx - start_idx).tail(max_data)
    ndf.reset_index(drop=True, inplace=True)
    print(ndf)
    
    candle = go.Candlestick(x=ndf.index,open=ndf['open'],high=ndf['high'],low=ndf['low'],close=ndf['close'], increasing_line_color = 'red',decreasing_line_color = 'blue', showlegend=False)
    upper = go.Scatter(x=ndf.index, y=ndf['upper'], line=dict(color='red', width=2), name='upper', showlegend=False)
    ma20 = go.Scatter(x=ndf.index, y=ndf['ma20'], line=dict(color='black', width=2), name='ma20', showlegend=False)
    lower = go.Scatter(x=ndf.index, y=ndf['lower'], line=dict(color='blue', width=2), name='lower', showlegend=False)

    volume = go.Bar(x=ndf.index, y=ndf['volume'], marker_color='red', name='volume', showlegend=False)

    MACD = go.Scatter(x=ndf.index, y=ndf['macd'], line=dict(color='blue', width=2), name='MACD', legendgroup='group2', legendgrouptitle_text='MACD')
    MACD_Signal = go.Scatter(x=ndf.index, y=ndf['signal'], line=dict(dash='dashdot', color='green', width=2), name='MACD_Signal')
    MACD_Oscil = go.Bar(x=ndf.index, y=ndf['flag'], marker_color='purple', name='MACD_Oscil')

    fast_k = go.Scatter(x=ndf.index, y=ndf['fast_k'], line=dict(color='skyblue', width=2), name='fast_k', legendgroup='group3', legendgrouptitle_text='%K %D')
    slow_d = go.Scatter(x=ndf.index, y=ndf['slow_d'], line=dict(dash='dashdot', color='black', width=2), name='slow_d')

    PB = go.Scatter(x=ndf.index, y=ndf['PB']*100, line=dict(color='blue', width=2), name='PB', legendgroup='group4', legendgrouptitle_text='PB, MFI')
    MFI10 = go.Scatter(x=ndf.index, y=ndf['MFI10'], line=dict(dash='dashdot', color='green', width=2), name='MFI10')

    RSI = go.Scatter(x=ndf.index, y=ndf['rsi14'], line=dict(color='red', width=2), name='RSI', legendgroup='group5', legendgrouptitle_text='RSI')
    
    # 스타일
    fig = ms.make_subplots(rows=5, cols=2, specs=[[{'rowspan':4},{}],[None,{}],[None,{}],[None,{}],[{},{}]], shared_xaxes=True, horizontal_spacing=0.03, vertical_spacing=0.01)

    fig.add_trace(candle,row=1,col=1)
    fig.add_trace(upper,row=1,col=1)
    fig.add_trace(ma20,row=1,col=1)
    fig.add_trace(lower,row=1,col=1)

    fig.add_trace(volume,row=5,col=1)

    fig.add_trace(candle,row=1,col=2)
    fig.add_trace(upper,row=1,col=2)
    fig.add_trace(ma20,row=1,col=2)
    fig.add_trace(lower,row=1,col=2)

    fig.add_trace(MACD,row=2,col=2)
    fig.add_trace(MACD_Signal,row=2,col=2)
    fig.add_trace(MACD_Oscil,row=2,col=2)

    fig.add_trace(fast_k,row=3,col=2)
    fig.add_trace(slow_d,row=3,col=2)

    fig.add_trace(PB,row=4,col=2)
    fig.add_trace(MFI10,row=4,col=2)

    fig.add_trace(RSI,row=5,col=2)

    # 추세추종
    # trend_fol = 0
    # trend_refol = 0
    # for i in ndf.index:
    #     if ndf['PB'][i] > 0.8 and ndf['MFI10'][i] > 80:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='orange', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)
    #     elif ndf['PB'][i] < 0.2 and ndf['MFI10'][i] < 20:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='darkblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)

    # 역추세추종
    # for i in ndf.index:
    #     if ndf['PB'][i] < 0.05 and ndf['IIP21'][i] > 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='purple', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)  #보라
    #         fig.add_trace(trend_refol,row=1,col=1)
    #     elif df['PB'][i] > 0.95 and ndf['IIP21'][i] < 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='skyblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)  #하늘
    #         fig.add_trace(trend_refol,row=1,col=1)    

    # fig.add_trace(trend_fol,row=1,col=1)
    # 추세추총전략을 통해 캔들차트에 표시합니다.

    # fig.add_trace(trend_refol,row=1,col=1)
    # 역추세 전략을 통해 캔들차트에 표시합니다.
    
    # fig.update_layout(autosize=True, xaxis1_rangeslider_visible=False, xaxis2_rangeslider_visible=False, margin=dict(l=50,r=50,t=50,b=50), template='seaborn', title=f'({ticker})의 날짜: ETH [추세추종전략:오↑파↓] [역추세전략:보↑하↓]')
    # fig.update_xaxes(tickformat='%y년%m월%d일', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray', showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_yaxes(tickformat=',d', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray',showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_traces(xhoverformat='%y년%m월%d일')
    # size = len(img)
    # img = plt.io.to_image(fig, format='png')
    # canvas = FigureCanvasBase(fig)
    # img = Image.frombytes(mode='RGB', size=(700, 500), data=fig.to_image().to, decoder_name='raw')
    # img = Image.frombytes('RGBA', (700, 500), plt.io.to_image(fig, format='png'), 'raw')
    # img = Image.fromarray('RGBA', (700, 500), np.array(plt.io.to_image(fig, format='png')), 'raw')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'RGB')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'L')
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=140, height=100)))
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=700, height=500)))
    img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png')))
    # print(img)
    img.convert("RGB")
    img.show()
    # print(img)
    img.thumbnail((i_h, i_w), Image.LANCZOS)
    # img.show()
    # print(img)
    # img = Image.Image(img, 'RGB')
    return img

def select_action(df, idx):
    print("idx{0:d} close:{1} closemin:{2} closemax:{3}".format(idx, df.loc[idx, "close"], df.loc[idx, "closemin"], df.loc[idx, "closemax"]))
    if df.loc[idx, "sma20deg"] >= 0 :
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 1)
    else:
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 2)
    return action

converter = torchvision.transforms.ToTensor()
charts = []
for idx in df.index:
    if idx < (start_idx + data_size):
        continue
    else:
        if idx % 50 == 0 :
            print("progress {0:.02f}".format(idx*100/df.index.max()))
        print("idx[{0:06d}]".format(idx))
        img = get_chart(df, idx, data_size, i_w = screen_width, i_h = screen_height)
        act = select_action(df, idx)
        print("act:{}".format(act))
        # act = 0 if  act < 0 else act
        # print("act:{}".format(act))
        img = converter(img).unsqueeze(0)
        # act = torch.tensor(act, dtype=torch.int32)
        act = torch.tensor(act, dtype=torch.float)
        charts.append([img,act])

joblib.dump(charts, "charts_{0:02d}_1.dmp".format(action_kind))

이 부분은 실제로 시간이 제일 오래 걸리는 부분입니다.

Jupyter Notebook이나 Vs code에 Notebook이 설치되어 있다면 Notebook으로 실행할 것을 권합니다.

실제 차트가 정상적으로 만들어 지고 있는지 확인할 필요가 있습니다.

그러나 Vs code의 Notebook의 경우 이미지 버퍼의 버그가 있어서 차트 50장 정도를 출력하고 나면 메모리 오류로 프로그램이 죽습니다.

실제 파일을 만들때는 아래 img.show()함수를 주석처리 하고 output출력도 최소화 하여 파일을 만들 필요가 있습니다.

joblib.dump는 torch.save와 같은 lib를 사용하여 시리얼라이즈를 하고 있습니다.

다만 joblib의 경우 python의 모든 변수를 저장하기는 하지만 한가지 값을 저장하는 용도라면 torch.save하는 것이 좋습니다.

torch.save의 경우는 deep learning model을 저장하는데 최적화 되어 있습니다.

상태를 dictionaly화 해서 저장하는 것이 틀립니다.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pymysql
import pandas as pd
import numpy as np
import time
import talib
from PIL import Image
import io
import os.path as path
import joblib
import common.constrant as Const

batch_size = 200
learning_rate = 0.001
num_epoch = 100
data_size = 100
action_kind = 4
screen_height = 50
screen_width  = 70
qsize = 2048

df = pd.read_csv("data_all.dat")

class DQN(nn.Module):
    def __init__(self, h, w, outputs, qsize):
        super(DQN, self).__init__()
        self.conv1 = nn.Conv2d(4, h*w, kernel_size=5, stride=2)
        self.bn1 = nn.BatchNorm2d(h*w)
        self.conv2 = nn.Conv2d(h*w, qsize, kernel_size=5, stride=2)
        self.bn2 = nn.BatchNorm2d(qsize)
        self.conv3 = nn.Conv2d(qsize, qsize, kernel_size=5, stride=2)
        # self.bn3 = nn.BatchNorm2d(qsize)

        linear_input_size = 3 * qsize
        # self.head = nn.Linear(linear_input_size, outputs)
        self.head = nn.Sequential(
            nn.Linear(linear_input_size, qsize),
            nn.ReLU(),
            nn.Linear(qsize, qsize),
            nn.ReLU(),
            nn.Linear(qsize, outputs)
        )

    # Called with either one element to determine next action, or a batch
    # during optimization. Returns tensor([[left0exp,right0exp]...]).
    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        # print("x1------->", x.size())
        x = F.relu(self.bn2(self.conv2(x)))
        # print("x2------->", x.size())
        x = F.relu(self.bn3(self.conv3(x)))
        # print("x3------->", x.size())
        x = x.view(x.size(0), -1)
        # print("x4------->", x.size())
        x = self.head(x)
        # print("x5------->", x.size())
        return  x

def select_action(df, idx):
    print("idx{0:d} close:{1} closemin:{2} closemax:{3}".format(idx, df.loc[idx, "close"], df.loc[idx, "closemin"], df.loc[idx, "closemax"]))
    if df.loc[idx, "sma20deg"] >= 0 :
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 1)
    else:
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 2)
    return action

# gpu가 사용 가능한 경우에는 device를 gpu로 설정하고 불가능하면 cpu로 설정합니다.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

charts = []
charts = joblib.load("charts_{0:02d}_1.dmp".format(action_kind))
train_loader = DataLoader(charts,batch_size=batch_size, shuffle=True,num_workers=2,drop_last=True)
test_loader = DataLoader(charts,batch_size=batch_size, shuffle=False,num_workers=2,drop_last=True)

# 모델을 지정한 장치로 올립니다.
model = DQN(screen_height, screen_width, action_kind, qsize=qsize).to(device)
if path.exists("train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize)):
    model.load_state_dict(torch.load("train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind,device,qsize)))
    model.eval()
loss_func = nn.CrossEntropyLoss()

# 최적화함수로는 Adam을 사용합니다.
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
optimizer.zero_grad()

loss_arr =[]
model.train()
for i in range(num_epoch):
    for j,[image,label] in enumerate(train_loader):
        x = image.to(device).squeeze(1)
        y = label.type(torch.LongTensor).to(device)
        
        output = model.forward(x)
        loss = loss_func(output,y)
        loss.backward()
        optimizer.step()
        
        if j % 1000 == 0:
            print(loss.cpu().detach().numpy())
            loss_arr.append(loss.cpu().detach().numpy())
            torch.save(model.state_dict(),"train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize))

torch.save(model.state_dict(),"train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize))
# pltx.plot(loss_arr)
# pltx.show()

model.eval()
# 맞은 개수, 전체 개수를 저장할 변수를 지정합니다.
correct = 0
total = 0.000000000001

# 인퍼런스 모드를 위해 no_grad 해줍니다.
with torch.no_grad():
# 테스트로더에서 이미지와 정답을 불러옵니다.
    for image,label in test_loader:
        # 두 데이터 모두 장치에 올립니다.
        x = image.to(device).squeeze(1)
        y_= label.to(device)

        # 모델에 데이터를 넣고 결과값을 얻습니다.
        output = model.forward(x)
        
        # https://pytorch.org/docs/stable/torch.html?highlight=max#torch.max
        # torch.max를 이용해 최대 값 및 최대값 인덱스를 뽑아냅니다.
        # 여기서는 최대값은 필요없기 때문에 인덱스만 사용합니다.
        _,output_index = torch.max(output,1)
        
        # 전체 개수는 라벨의 개수로 더해줍니다.
        # 전체 개수를 알고 있음에도 이렇게 하는 이유는 batch_size, drop_last의 영향으로 몇몇 데이터가 잘릴수도 있기 때문입니다.
        total += label.size(0)
        
        # 모델의 결과의 최대값 인덱스와 라벨이 일치하는 개수를 correct에 더해줍니다.
        correct += (output_index == y_).sum().float()

    # 테스트 데이터 전체에 대해 위의 작업을 시행한 후 정확도를 구해줍니다.
    print("Accuracy of Test Data: {}%".format(100*correct/total))

이렇게 charts.dmp 파일을 여러개의 이름으로 만들고

학습함수에 넣어서 학습을 진행합니다.

자신이 가지고 있는 보드수 만큼만 프로세스를 뛰울 수 있으니 보드가 한장이면 위의 소스를 사용하시면 되고

혹시 여러장의 RTX보드를 가지고 있다면 model = nn.DataParallel(model, device_ids=[0,1]).to(device) 를 추가하는 것이 좋습니다.

(보드가 2장일 때, 보드가 3장일 경우는 [0,1]을 [0,1,2]로 바꾸면 됩니다.

단 보드는 같은 것을 사용하는 것이 좋습니다.

보드0과 보드1의 gpu core 용량이 75%이상 차이가 나면 실행할 수 없습니다.

위치는 현재 소스의 모델 바로 아래 입니다.

단 이 경우 model.load_state_dict 함수를 model.module.load_state_dict함수로 변경을 해야 합니다.

지금 저장되어 있는 pt파일의 경우 단순 모델을 기준으로 저장되어 있으므로 다른 PC나 혹시 싱글로 predict를 할 경우는 싱글보드에 맞게 저장이 되어야 합니다.

마찬가지로 저장 또한

torch.save(model.state_dict(),"train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize))

로 저장할 경우 model.module.state_dict()의 값이 저장이 되어야 합니다.

그렇지 않으면 어렵게 저장된 charts.dmp파일을 활용할 수가 없습니다.

학습은 0.00001정도 까지 진행을 하면 predict 확률은 거의 100% 정도 나올 것입니다.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pymysql
import pandas as pd
import numpy as np
import time
import talib
from PIL import Image
import io
import os.path as path
import joblib
import common.constrant as Const

batch_size = 200
learning_rate = 0.001
num_epoch = 100
data_size = 100
action_kind = 5
screen_height = 50
screen_width  = 70
qsize = 2048

df = pd.read_csv("data_all.dat")

class Account():
    def __init__(self, df, origin) -> None:
        self.BASIC_FEES = 0.0005
        self.hold_score = 0
        self.byu_score = 0
        self.sell_score = 0        
        self.orgin_money = origin
        self.money = origin
        self.balance = origin
        self.unit  = 0
        self.buy_index = 0
        self.max_rate = 0
        self.df = df
        self.rate = 0
        self.old_rate = 0
        self.bak_unit = 0
        self.bak_balance = 0
    
    def reset(self):
        self.balance = self.orgin_money
        self.money = self.orgin_money
        self.old_rate = 0
        self.unit = 0
        self.rate = 0
        self.buy_index = 0
        
    def select(self, action:int): 
        ret_action = 0
        if self.unit > 0 :
            ret_action = 0 if action == 1 else action
        else:
            ret_action = 0 if action == 2 else action
        return ret_action
    
    def exec_action(self, action, idx):
        real_action = self.select(action)
        if real_action == 1:
            return self.unit_buy(idx), real_action
        elif real_action == 2:
            return self.unit_sell(idx), real_action
        else:
            return self.unit_hold(idx), real_action
    
    def unit_buy(self, index):
        self.back_up()
        buy_balance = self.balance
        self.buy_index = index
        self.old_rate = self.rate
        while True:
            buy_unit = buy_balance / self.df.loc[index, 'close']
            amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
            if self.balance < amount :
                buy_balance -= self.balance * 0.0001
            else:
                self.balance -= amount
                self.unit = buy_unit
                self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                # if index%50 == 0 : print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                break
        return self.rate
        
    def unit_sell(self, index):
        self.back_up()
        self.old_rate = self.rate
        sell_balance = self.unit * self.df.loc[index, 'close'] - (self.unit * self.df.loc[index, 'close']) * self.BASIC_FEES
        self.balance += sell_balance
        self.unit = 0
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # if index%50 == 0 : print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        self.money = self.balance
        self.max_rate = 0
        return self.rate
    
    def unit_hold(self, index):
        self.old_rate = self.rate
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # print("index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        return self.rate
    
    def get_newaction(self, index):
        if self.unit > 0:
            hold_rate = ((self.unit * self.df.loc[index, "sma3"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            sell_rate = ((self.unit * self.df.loc[index, "close"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > sell_rate else 2
        else:
            buy_balance = self.balance
            self.buy_index = index
            self.old_rate = self.rate
            while True:
                buy_unit = buy_balance / self.df.loc[index, 'close']
                amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
                if self.balance < amount :
                    buy_balance -= self.balance * 0.0001
                else:
                    self.balance -= amount
                    self.unit = buy_unit
                    buy_rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                    break
            hold_rate = (self.balance - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > buy_rate else 1
        
    def back_up(self):
        self.bak_balance = self.balance
        self.bak_unit = self.unit

    def  back_ward(self):
        self.balance = self.bak_balance 
        self.unit = self.bak_unit
                
    def is_bankrupt(self):
        return (self.money < 10000)

start_idx = df.index[0]
print("start_idx[{}]".format(start_idx))
def get_chart(df, idx, max_data:int=300, i_w:int=140, i_h:int=100):
    ndf = df.head(idx - start_idx).tail(max_data)
    ndf.reset_index(drop=True, inplace=True)
    
    candle = go.Candlestick(x=ndf.index,open=ndf['open'],high=ndf['high'],low=ndf['low'],close=ndf['close'], increasing_line_color = 'red',decreasing_line_color = 'blue', showlegend=False)
    upper = go.Scatter(x=ndf.index, y=ndf['upper'], line=dict(color='red', width=2), name='upper', showlegend=False)
    ma20 = go.Scatter(x=ndf.index, y=ndf['ma20'], line=dict(color='black', width=2), name='ma20', showlegend=False)
    lower = go.Scatter(x=ndf.index, y=ndf['lower'], line=dict(color='blue', width=2), name='lower', showlegend=False)

    volume = go.Bar(x=ndf.index, y=ndf['volume'], marker_color='red', name='volume', showlegend=False)

    MACD = go.Scatter(x=ndf.index, y=ndf['macd'], line=dict(color='blue', width=2), name='MACD', legendgroup='group2', legendgrouptitle_text='MACD')
    MACD_Signal = go.Scatter(x=ndf.index, y=ndf['signal'], line=dict(dash='dashdot', color='green', width=2), name='MACD_Signal')
    MACD_Oscil = go.Bar(x=ndf.index, y=ndf['flag'], marker_color='purple', name='MACD_Oscil')

    fast_k = go.Scatter(x=ndf.index, y=ndf['fast_k'], line=dict(color='skyblue', width=2), name='fast_k', legendgroup='group3', legendgrouptitle_text='%K %D')
    slow_d = go.Scatter(x=ndf.index, y=ndf['slow_d'], line=dict(dash='dashdot', color='black', width=2), name='slow_d')

    PB = go.Scatter(x=ndf.index, y=ndf['PB']*100, line=dict(color='blue', width=2), name='PB', legendgroup='group4', legendgrouptitle_text='PB, MFI')
    MFI10 = go.Scatter(x=ndf.index, y=ndf['MFI10'], line=dict(dash='dashdot', color='green', width=2), name='MFI10')

    RSI = go.Scatter(x=ndf.index, y=ndf['rsi14'], line=dict(color='red', width=2), name='RSI', legendgroup='group5', legendgrouptitle_text='RSI')
    
    # 스타일
    fig = ms.make_subplots(rows=5, cols=2, specs=[[{'rowspan':4},{}],[None,{}],[None,{}],[None,{}],[{},{}]], shared_xaxes=True, horizontal_spacing=0.03, vertical_spacing=0.01)

    fig.add_trace(candle,row=1,col=1)
    fig.add_trace(upper,row=1,col=1)
    fig.add_trace(ma20,row=1,col=1)
    fig.add_trace(lower,row=1,col=1)

    fig.add_trace(volume,row=5,col=1)

    fig.add_trace(candle,row=1,col=2)
    fig.add_trace(upper,row=1,col=2)
    fig.add_trace(ma20,row=1,col=2)
    fig.add_trace(lower,row=1,col=2)

    fig.add_trace(MACD,row=2,col=2)
    fig.add_trace(MACD_Signal,row=2,col=2)
    fig.add_trace(MACD_Oscil,row=2,col=2)

    fig.add_trace(fast_k,row=3,col=2)
    fig.add_trace(slow_d,row=3,col=2)

    fig.add_trace(PB,row=4,col=2)
    fig.add_trace(MFI10,row=4,col=2)

    fig.add_trace(RSI,row=5,col=2)

    # 추세추종
    # trend_fol = 0
    # trend_refol = 0
    # for i in ndf.index:
    #     if ndf['PB'][i] > 0.8 and ndf['MFI10'][i] > 80:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='orange', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)
    #     elif ndf['PB'][i] < 0.2 and ndf['MFI10'][i] < 20:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='darkblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)

    # 역추세추종
    # for i in ndf.index:
    #     if ndf['PB'][i] < 0.05 and ndf['IIP21'][i] > 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='purple', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)  #보라
    #         fig.add_trace(trend_refol,row=1,col=1)
    #     elif df['PB'][i] > 0.95 and ndf['IIP21'][i] < 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='skyblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)  #하늘
    #         fig.add_trace(trend_refol,row=1,col=1)    

    # fig.add_trace(trend_fol,row=1,col=1)
    # 추세추총전략을 통해 캔들차트에 표시합니다.

    # fig.add_trace(trend_refol,row=1,col=1)
    # 역추세 전략을 통해 캔들차트에 표시합니다.
    
    # fig.update_layout(autosize=True, xaxis1_rangeslider_visible=False, xaxis2_rangeslider_visible=False, margin=dict(l=50,r=50,t=50,b=50), template='seaborn', title=f'({ticker})의 날짜: ETH [추세추종전략:오↑파↓] [역추세전략:보↑하↓]')
    # fig.update_xaxes(tickformat='%y년%m월%d일', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray', showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_yaxes(tickformat=',d', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray',showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_traces(xhoverformat='%y년%m월%d일')
    # size = len(img)
    # img = plt.io.to_image(fig, format='png')
    # canvas = FigureCanvasBase(fig)
    # img = Image.frombytes(mode='RGB', size=(700, 500), data=fig.to_image().to, decoder_name='raw')
    # img = Image.frombytes('RGBA', (700, 500), plt.io.to_image(fig, format='png'), 'raw')
    # img = Image.fromarray('RGBA', (700, 500), np.array(plt.io.to_image(fig, format='png')), 'raw')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'RGB')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'L')
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=140, height=100)))
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=700, height=500)))
    img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png')))
    # print(img)
    img.convert("RGB")
    # print(img)
    img.thumbnail((i_h, i_w), Image.LANCZOS)
    # print(img)
    # img = Image.Image(img, 'RGB')
    # img.show()
    return img

class DQN(nn.Module):
    def __init__(self, h, w, outputs, qsize):
        super(DQN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(4, h*w, kernel_size=5, stride=2),
            nn.BatchNorm2d(h*w)
        )
        self.conv2 = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(h*w, qsize, kernel_size=5, stride=2),
            nn.BatchNorm2d(qsize),
            # nn.MaxPool2d(kernel_size=2,stride=2)
        )
        self.conv3 = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(qsize, qsize, kernel_size=5, stride=2),
            nn.BatchNorm2d(qsize),
            nn.Softmax()
            # nn.MaxPool2d(kernel_size=2,stride=2)
        )
        # self.mp3 = nn.MaxPool2d(kernel_size=2,stride=2)                                
        # # [batch_size,64,6,6] -> [batch_size,64,3,3]

        linear_input_size = 3 * qsize
        self.head = nn.Linear(linear_input_size, outputs)
        # self.head = nn.Sequential(
        #     nn.Linear(linear_input_size, qsize),
        #     nn.ReLU(),
        #     nn.Linear(qsize, qsize),
        #     nn.ReLU(),
        #     nn.Linear(qsize, outputs)
        # )

    # Called with either one element to determine next action, or a batch
    # during optimization. Returns tensor([[left0exp,right0exp]...]).
    def forward(self, x):
        x = self.conv1(x)
        print("x1------->", x.size())
        x = self.conv2(x)
        print("x2------->", x.size())
        x = self.conv3(x)
        print("x3------->", x.size())
        x = x.view(x.size(0), -1)
        print("x4------->", x.size())
        x = self.head(x)
        print("x5------->", x.size())
        print(x)
        x = torch.argmax(x, dim= -1)
        print(x)
        print("x6------->", x.size())
        return  x

def select_action(df, idx):
    print("idx{0:d} close:{1} closemin:{2} closemax:{3}".format(idx, df.loc[idx, "close"], df.loc[idx, "closemin"], df.loc[idx, "closemax"]))
    if df.loc[idx, "sma20deg"] >= 0 :
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 1)
    else:
        action = 0 if df.loc[idx, "close"] <= df.loc[idx, "closemin"] else ((action_kind-1) if df.loc[idx, "close"] >= df.loc[idx, "closemax"] else 2)
    return action

# gpu가 사용 가능한 경우에는 device를 gpu로 설정하고 불가능하면 cpu로 설정합니다.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
account = Account(df, 50000000)
account.reset()

converter = torchvision.transforms.ToTensor()

# 모델을 지정한 장치로 올립니다.
model = DQN(screen_height, screen_width, action_kind, qsize=qsize).to(device)
if path.exist("train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind, device, qsize)):
    torch.load(model.state_dict(), "train_dqn_{0:02d}_{1}_{2:04d}.pt".format(action_kind,device,qsize))
model.eval()

with torch.no_grad():
    tot = 0.0000000000001
    corr = 0.0
    predict_arr = []
    for idy in df.index:
        x = get_chart(df, idy, data_size, i_w = screen_width, i_h = screen_height)
        x = converter(x).unsqueeze(0).to(device).squeeze(1)

        output = model.forward(x)

        _,action = torch.max(output,1)
        # print("action:", action)
        action = action.cpu().numpy()[0]
        # print("action:", action)
        predict_arr[action] += 1
        sel_act = select_action(df, idy)
        tot += 1 if sel_act in [0, action_kind-1] or action in [0, action_kind-1] else 0
        corr += 1 if (sel_act in [0, action_kind-1] or action in [0, action_kind-1]) and sel_act == action else 0
        reward, real_action = account.exec_action(2 if action == (action_kind - 1) else (1 if action == 0 else 0) , idy)
        if idy % 100 == 0:
            print("progress {0:.2f}".format(idy * 100 / df.index.max()))
            print("idy:%d action[%d:%d] price [%.4f] unit[%.4f] agent rate:%.05f remind money:%.02f accuracy:%.02f" 
                % (idy, sel_act, action, df.loc[idy, 'close'], account.unit, account.rate, account.balance + account.unit * df.loc[idy, 'close'], 100*corr/tot))
        print("prediction count")
        for act in predict_arr:
            print("action{0:02d}  ==>  {1:d}".format(act, predict_arr[act]))

실제 데이터를 이용하여 정확도를 재측정합니다.

실제 시스템에 적용할지 말지는 본인이 판단해야 할 것 같습니다.

모델이 잘 작동하지 않거나 차수가 다르게 나오는 등의 문제가 발생할 수 있습니다.

댓글로 문의 주시면 상세히 답변 드리겠습니다.

Good Luck!!!

LIST

저작자표시 비영리

'python > 자동매매 프로그램' 카테고리의 다른 글

DQN network 수정 (0)	2024.03.12
Colab에서 살아남기 (3)	2023.03.07
강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 (1)	2023.02.14
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 (5)	2023.01.30
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 (1)	2022.12.25

강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화

아빠는 벌레잡이 2023. 2. 14. 23:54

2023. 2. 14. 23:54

데이터 과학자들은 수세기 동안 지금의 RDBMS와는 다른 개념의 데이터베이스를 만들고자 했습니다.
예를 들면 일기 예보같은 데이타를 전부 모은 데이터베이스를 만드는 것입니다.
지금 현재의 기온,습도,바람의 세기,강수량으로 내일의 날씨를 맞추려고 했습니다.
하지만 지금의 RDBMS는 데이터를 정렬하고 결과를 알아내는데 (데이터가 많아지면 많아질수록)불가능하다는 것을 알게 했습니다.
그러나 과학자들은 계속 고집을 부렸습니다.
하둡을 만들어서 더 많은 데이터를 쌓으려 하고 망고DB로 RDBMS의 릴레이션을 깨트리려 했습니다.
데이터 분석을 위해 비주얼라이제이션을 학문화하고 R이라는 언어도 만들었습니다.
그리고 파이썬을 개선해나갔습니다.
그럴수록 이것이 불가능하다는 것을 알게 됩니다.
그래서 한가지 아이디어를 냅니다.
딥런닝으로 모든 데이터를 학습해서 모델을 만들고 모델을 가지고서 답을 찾는거죠.
이제 수초 정도면 내일의 날씨를 예측할 수 있습니다.
슈퍼컴이 아니어도 일반컴으로도 그런 프로그램을 만들 수 있습니다.
이제 우리는 모든 종류의 데이타를 가지고 있습니다.
또 모든 데이터를 (기계가) 학습할 수 있습니다.
아직 고전적 방법의 데이터베이스를 필요로 하는 분야도 있지만 지금은 딥런닝이 만든 모델을 이용한 이 요상한 데이터베이스 방식을 더 선호합니다.
예를 들면 바둑기보를 다 저장한 컴퓨터가 그 데이터를 기반으로 다음 수를 찾는 다면 아마도 시간초과로 사람에게 질겁니다.
지금은 딥런링으로 학습된 데이터베이스를 가지고 있습니다.
슈퍼컴이 아닌 수백대의 컴퓨터가 분산작업을 하고 결과를 냅니다.
우리의 매매프로그램도 마찬가지입니다.
50만장의 차트이미지를 비지도 학습으로 군단화합니다.
결과가 1000가지가 될지 10000가지가 넘을지 아니면 100가지 일지는 알 수 없습니다.
그만큼 이 이미지들의 군단이 들어갈 구멍 즉 qsize가 커야 합니다.
메모리가 허용하는한 최대값으로 해도 상관이 없습니다.
그러나 학습이 완료되게 하려면 자신에 맞게 그 값을 찾아야 합니다.
124정도의 값을 하면 무리가 없을 듯 합니다.
이제 학습률을 조정합니다.
학습률은 경사하강법시 얼마나 Jump up할지의 값(뛸지의 값)이라고 보면 됩니다.
이 값이 크지면 로스(loss)그래프는 뽀족한 파형을 만들 것이고 이 값이 적다면 로스는 처음은 떨어지다가 나중에는 점점 커지게 될 것입니다.
적당한 값이라면 역L자의 그래프를 그릴 것 입니다.
학습된 모델은 pt파일로 저장하므로 처음은 0.001정도로 셋팅을 하여 학습을 진행하다가 로스값이 값이 더 이상 줄지않고
뽀족한 파장을 보인다면 학습을 중단한(ctrl + c 또는 ctrl + z 후 kill %1-linux라면) 다음 0.0001(더 학습이 필요하다면 * 0.001을 꼽하여 학습률을 더 작게 만듭니다.) 정도로 낮추어 학습을 진행해서 로스값이 0에 최대로 가까이 가게 학습합니다.
정확도가 몇프로 정도인지 측정하데 action_kind의 0과 n-1은 무조건 맞아야 하므로 그 값들의 정확도를 다시 측정합니다.
아래 소스는 자신에게 맞게 약간의 수정이 필요합니다.
난이도는 거의 제로라고 보시면 됩니다.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.init as init
import torchvision
import torch.nn.functional as F

# https://pytorch.org/docs/stable/torchvision/datasets.html
# 파이토치에서는 torchvision.datasets에 MNIST 등의 다양한 데이터를 사용하기 용이하게 정리해놨습니다.
# 이를 사용하면 데이터를 따로 학습에 맞게 정리하거나 하지 않아도 바로 사용이 가능합니다.
import torchvision.datasets as dset

# https://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transforms
# torchvision.transforms에는 이미지 데이터를 자르거나 확대 및 다양하게 변형시키는 함수들이 구현되어 있습니다. 
import torchvision.transforms as transforms

# https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader
# DataLoader는 전처리가 끝난 데이터들을 지정한 배치 크기에 맞게 모아서 전달해주는 역할을 합니다.
from torch.utils.data import DataLoader
import pandas as pd

import numpy as np
import matplotlib.pyplot as pltx

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly.express as px
import plotly as plt
import pymysql
import pandas as pd
import numpy as np
import time
import talib
from PIL import Image
import io
import os.path as path
import joblib

ticker = 'ETH'
print(torch.__version__)

batch_size = 500
learning_rate = 0.002
num_epoch = 100
data_size = 100
action_kind = 5
screen_height = 50
screen_width  = 70
qsize = 512

df = pd.read_csv("data_all.dat")
# df = df.tail((df.index.max() // 5) * 5)
df = df.head((df.index.max() // 5) * 5)
# df = df.head(120)
# df = df.tail(250000)

def get_chart(df, idx, max_data:int=300, i_w:int=140, i_h:int=100):
    ndf = df.head(idx).tail(max_data)
    ndf.reset_index(drop=True, inplace=True)
    
    candle = go.Candlestick(x=ndf.index,open=ndf['open'],high=ndf['high'],low=ndf['low'],close=ndf['close'], increasing_line_color = 'red',decreasing_line_color = 'blue', showlegend=False)
    upper = go.Scatter(x=ndf.index, y=ndf['upper'], line=dict(color='red', width=2), name='upper', showlegend=False)
    ma20 = go.Scatter(x=ndf.index, y=ndf['ma20'], line=dict(color='black', width=2), name='ma20', showlegend=False)
    lower = go.Scatter(x=ndf.index, y=ndf['lower'], line=dict(color='blue', width=2), name='lower', showlegend=False)

    volume = go.Bar(x=ndf.index, y=ndf['volume'], marker_color='red', name='volume', showlegend=False)

    MACD = go.Scatter(x=ndf.index, y=ndf['macd'], line=dict(color='blue', width=2), name='MACD', legendgroup='group2', legendgrouptitle_text='MACD')
    MACD_Signal = go.Scatter(x=ndf.index, y=ndf['signal'], line=dict(dash='dashdot', color='green', width=2), name='MACD_Signal')
    MACD_Oscil = go.Bar(x=ndf.index, y=ndf['flag'], marker_color='purple', name='MACD_Oscil')

    fast_k = go.Scatter(x=ndf.index, y=ndf['fast_k'], line=dict(color='skyblue', width=2), name='fast_k', legendgroup='group3', legendgrouptitle_text='%K %D')
    slow_d = go.Scatter(x=ndf.index, y=ndf['slow_d'], line=dict(dash='dashdot', color='black', width=2), name='slow_d')

    PB = go.Scatter(x=ndf.index, y=ndf['PB']*100, line=dict(color='blue', width=2), name='PB', legendgroup='group4', legendgrouptitle_text='PB, MFI')
    MFI10 = go.Scatter(x=ndf.index, y=ndf['MFI10'], line=dict(dash='dashdot', color='green', width=2), name='MFI10')

    RSI = go.Scatter(x=ndf.index, y=ndf['rsi14'], line=dict(color='red', width=2), name='RSI', legendgroup='group5', legendgrouptitle_text='RSI')
    
    # 스타일
    fig = ms.make_subplots(rows=5, cols=2, specs=[[{'rowspan':4},{}],[None,{}],[None,{}],[None,{}],[{},{}]], shared_xaxes=True, horizontal_spacing=0.03, vertical_spacing=0.01)

    fig.add_trace(candle,row=1,col=1)
    fig.add_trace(upper,row=1,col=1)
    fig.add_trace(ma20,row=1,col=1)
    fig.add_trace(lower,row=1,col=1)

    fig.add_trace(volume,row=5,col=1)

    fig.add_trace(candle,row=1,col=2)
    fig.add_trace(upper,row=1,col=2)
    fig.add_trace(ma20,row=1,col=2)
    fig.add_trace(lower,row=1,col=2)

    fig.add_trace(MACD,row=2,col=2)
    fig.add_trace(MACD_Signal,row=2,col=2)
    fig.add_trace(MACD_Oscil,row=2,col=2)

    fig.add_trace(fast_k,row=3,col=2)
    fig.add_trace(slow_d,row=3,col=2)

    fig.add_trace(PB,row=4,col=2)
    fig.add_trace(MFI10,row=4,col=2)

    fig.add_trace(RSI,row=5,col=2)

    # 추세추종
    # trend_fol = 0
    # trend_refol = 0
    # for i in ndf.index:
    #     if ndf['PB'][i] > 0.8 and ndf['MFI10'][i] > 80:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='orange', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)
    #     elif ndf['PB'][i] < 0.2 and ndf['MFI10'][i] < 20:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='darkblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)

    # 역추세추종
    # for i in ndf.index:
    #     if ndf['PB'][i] < 0.05 and ndf['IIP21'][i] > 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='purple', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)  #보라
    #         fig.add_trace(trend_refol,row=1,col=1)
    #     elif df['PB'][i] > 0.95 and ndf['IIP21'][i] < 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='skyblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)  #하늘
    #         fig.add_trace(trend_refol,row=1,col=1)    

    # fig.add_trace(trend_fol,row=1,col=1)
    # 추세추총전략을 통해 캔들차트에 표시합니다.

    # fig.add_trace(trend_refol,row=1,col=1)
    # 역추세 전략을 통해 캔들차트에 표시합니다.
    
    # fig.update_layout(autosize=True, xaxis1_rangeslider_visible=False, xaxis2_rangeslider_visible=False, margin=dict(l=50,r=50,t=50,b=50), template='seaborn', title=f'({ticker})의 날짜: ETH [추세추종전략:오↑파↓] [역추세전략:보↑하↓]')
    # fig.update_xaxes(tickformat='%y년%m월%d일', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray', showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_yaxes(tickformat=',d', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray',showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_traces(xhoverformat='%y년%m월%d일')
    # size = len(img)
    # img = plt.io.to_image(fig, format='png')
    img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png')))
    img.convert("RGB")
    img.thumbnail((i_h, i_w), Image.LANCZOS)
    return img

class Account():
    def __init__(self, df, origin) -> None:
        self.BASIC_FEES = 0.0005
        self.hold_score = 0
        self.byu_score = 0
        self.sell_score = 0        
        self.orgin_money = origin
        self.money = origin
        self.balance = origin
        self.unit  = 0
        self.buy_index = 0
        self.max_rate = 0
        self.df = df
        self.rate = 0
        self.old_rate = 0
        self.bak_unit = 0
        self.bak_balance = 0
    
    def reset(self):
        self.balance = self.orgin_money
        self.money = self.orgin_money
        self.old_rate = 0
        self.unit = 0
        self.rate = 0
        self.buy_index = 0
        
    def select(self, action:int): 
        ret_action = 0
        if self.unit > 0 :
            ret_action = 0 if action == 1 else action
        else:
            ret_action = 0 if action == 2 else action
        return ret_action
    
    def exec_action(self, action, idx):
        real_action = self.select(action)
        if real_action == 1:
            return self.unit_buy(idx), real_action
        elif real_action == 2:
            return self.unit_sell(idx), real_action
        else:
            return self.unit_hold(idx), real_action
    
    def unit_buy(self, index):
        self.back_up()
        buy_balance = self.balance
        self.buy_index = index
        self.old_rate = self.rate
        while True:
            buy_unit = buy_balance / self.df.loc[index, 'close']
            amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
            if self.balance < amount :
                buy_balance -= self.balance * 0.0001
            else:
                self.balance -= amount
                self.unit = buy_unit
                self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                # if index%50 == 0 : print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                print("buy index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
                break
        return self.rate
        
    def unit_sell(self, index):
        self.back_up()
        self.old_rate = self.rate
        sell_balance = self.unit * self.df.loc[index, 'close'] - (self.unit * self.df.loc[index, 'close']) * self.BASIC_FEES
        self.balance += sell_balance
        self.unit = 0
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # if index%50 == 0 : print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        print("sell index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        self.money = self.balance
        self.max_rate = 0
        return self.rate
    
    def unit_hold(self, index):
        self.old_rate = self.rate
        self.rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
        # print("index[{}]hold unit[{:,.6f}] remind money[{:,.6f}] rate[{:,.8f}] expected[{:,.8f}]".format(index, self.unit, self.balance, self.rate, (self.unit * self.df.loc[index, 'close'] + self.balance)))
        return self.rate
    
    def get_newaction(self, index):
        if self.unit > 0:
            hold_rate = ((self.unit * self.df.loc[index, "sma3"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            sell_rate = ((self.unit * self.df.loc[index, "close"] + self.balance) - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > sell_rate else 2
        else:
            buy_balance = self.balance
            self.buy_index = index
            self.old_rate = self.rate
            while True:
                buy_unit = buy_balance / self.df.loc[index, 'close']
                amount = (buy_unit * self.df.loc[index, 'close']) * self.BASIC_FEES + buy_unit * self.df.loc[index, 'close']
                if self.balance < amount :
                    buy_balance -= self.balance * 0.0001
                else:
                    self.balance -= amount
                    self.unit = buy_unit
                    buy_rate = ((self.unit * self.df.loc[index, 'close'] + self.balance) - self.orgin_money) * 100 / self.orgin_money
                    break
            hold_rate = (self.balance - self.orgin_money) * 100 / self.orgin_money
            return 0 if hold_rate > buy_rate else 1
        
    def back_up(self):
        self.bak_balance = self.balance
        self.bak_unit = self.unit

    def  back_ward(self):
        self.balance = self.bak_balance 
        self.unit = self.bak_unit
                
    def is_bankrupt(self):
        return (self.money < 10000)

class DQN(nn.Module):
    def __init__(self, h, w, outputs, qsize):
        super(DQN, self).__init__()
        self.conv1 = nn.Conv2d(4, h*w, kernel_size=5, stride=2)
        self.bn1 = nn.BatchNorm2d(h*w)
        self.conv2 = nn.Conv2d(h*w, qsize, kernel_size=5, stride=2)
        self.bn2 = nn.BatchNorm2d(qsize)
        self.conv3 = nn.Conv2d(qsize, qsize, kernel_size=5, stride=2)
        self.bn3 = nn.BatchNorm2d(qsize)

        # Number of Linear input connections depends on output of conv2d layers
        # and therefore the input image size, so compute it.
        # def conv2d_size_out(size, kernel_size = 5, stride = 2):
        #     return (size - (kernel_size - 1) - 1) // stride  + 1
        # convw = conv2d_size_out(conv2d_size_out(conv2d_size_out(w)))
        # convh = conv2d_size_out(conv2d_size_out(conv2d_size_out(h)))
        # print("convw[%d]  convh[%d]" % (convw, convh))
        linear_input_size = 3 * qsize
        self.head = nn.Linear(linear_input_size, outputs)

    # Called with either one element to determine next action, or a batch
    # during optimization. Returns tensor([[left0exp,right0exp]...]).
    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        x = F.relu(self.bn3(self.conv3(x)))
        return self.head(x.view(x.size(0), -1))

def select_action(df, idx):
    return ((df.loc[idx, "close"] - df.loc[idx, "closemin"]) * (action_kind-1) // (df.loc[idx, "closemax"] - df.loc[idx, "closemin"]))

# gpu가 사용 가능한 경우에는 device를 gpu로 설정하고 불가능하면 cpu로 설정합니다.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
account = Account(df, 50000000)
account.reset()

converter = torchvision.transforms.ToTensor()
charts = []
for idx in df.index:
    if idx <= data_size:
        continue
    else:
        img = get_chart(df, idx, data_size, i_w = screen_width, i_h = screen_height)
        act = select_action(df, idx)
        # display(img)
        img = converter(img).unsqueeze(0)
        act = torch.tensor(act, dtype=torch.int64)
        charts.append([img,act])

joblib.dump(charts, "charts_{0:d}.dmp".format(action_kind))
train_loader = DataLoader(charts,batch_size=batch_size, shuffle=True,num_workers=2,drop_last=True)
test_loader = DataLoader(charts,batch_size=batch_size, shuffle=False,num_workers=2,drop_last=True)

# 모델을 지정한 장치로 올립니다.
model = DQN(screen_height, screen_width, action_kind, qsize).to(device)

if path.exists("pt/train_dqn_{0:02d}_{1}.pt".format(action_kind, device)):
    model.load_state_dict(torch.load("pt/train_dqn_{0:02d}_{1}.pt".format(action_kind, device)))
model.train()
# 손실함수로는 크로스엔트로피를 사용합니다.
loss_func = nn.CrossEntropyLoss()

# 최적화함수로는 Adam을 사용합니다.
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

loss_arr =[]
for i in range(num_epoch):
    for j,[image,label] in enumerate(train_loader):
        x = image.to(device).squeeze(1)
        # print(x)
        y_= label.to(device)
        # print(y_)
        
        optimizer.zero_grad()
        output = model.forward(x)
        loss = loss_func(output,y_)
        loss.backward()
        optimizer.step()
        
        if j % 100 == 0:
            print(loss.cpu().detach().numpy())
            loss_arr.append(loss.cpu().detach().numpy())

pltx.plot(loss_arr)
pltx.show()

# 맞은 개수, 전체 개수를 저장할 변수를 지정합니다.
correct = 0
total = 0.000000000001

# 인퍼런스 모드를 위해 no_grad 해줍니다.
model.eval()
with torch.no_grad():
# 테스트로더에서 이미지와 정답을 불러옵니다.
    for image,label in test_loader:
        # 두 데이터 모두 장치에 올립니다.
        x = image.to(device).squeeze(1)
        y_= label.to(device)

        # 모델에 데이터를 넣고 결과값을 얻습니다.
        output = model.forward(x)
        
        # https://pytorch.org/docs/stable/torch.html?highlight=max#torch.max
        # torch.max를 이용해 최대 값 및 최대값 인덱스를 뽑아냅니다.
        # 여기서는 최대값은 필요없기 때문에 인덱스만 사용합니다.
        _,output_index = torch.max(output,1)
        
        # 전체 개수는 라벨의 개수로 더해줍니다.
        # 전체 개수를 알고 있음에도 이렇게 하는 이유는 batch_size, drop_last의 영향으로 몇몇 데이터가 잘릴수도 있기 때문입니다.
        total += label.size(0)
        
        # 모델의 결과의 최대값 인덱스와 라벨이 일치하는 개수를 correct에 더해줍니다.
        correct += (output_index == y_).sum().float()

    # 테스트 데이터 전체에 대해 위의 작업을 시행한 후 정확도를 구해줍니다.
    print("Accuracy of Test Data: {}%".format(100*correct/total))
    torch.save(model.state_dict(),"pt/train_dqn_{0:02d}_{1}.pt".format(action_kind, device))

    tot = 0.0000000000001
    corr = 0.0
    for idy in df.index:
        x = get_chart(df, idy, data_size, i_w = screen_width, i_h = screen_height)
        x = converter(x).unsqueeze(0).to(device).squeeze(1)

        output = model.forward(x)

        _,action = torch.max(output,1)
        # print("action:", action)
        action = action.cpu().numpy()[0]
        # print("action:", action)
        sel_act = select_action(df, idy)
        tot += 1 if sel_act in [0, action_kind-1] else 0
        corr += 1 if sel_act in [0, action_kind-1] and sel_act == action else 0
        reward, real_action = account.exec_action(2 if action == (action_kind - 1) else (1 if action == 0 else 0) , idy)
        if idy % 100 == 0:
            print("idy:%d action[%d:%d] price [%.4f] unit[%.4f] agent rate:%.05f remind money:%.02f accuracy:%.02f" 
                % (idy, sel_act, action, df.loc[idy, 'close'], account.unit, account.rate, account.balance + account.unit * df.loc[idy, 'close'], 100*corr/tot))

LIST

'python > 자동매매 프로그램' 카테고리의 다른 글

Colab에서 살아남기 (3)	2023.03.07
강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리 (5)	2023.02.25
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 (5)	2023.01.30
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 (1)	2022.12.25
강화학습을 이용한 비트코인 매매프로그램(7)-Back Test (1)	2022.12.24

강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용

아빠는 벌레잡이 2023. 1. 30. 07:34

2023. 1. 30. 07:34

from lib.UpbitTrade import UpbitTrade
from lib.ColoredLogger import COLORS_LOG as cl
from lib.Upsert import aio_data_update
# from lib.upbitSellBuy import *
from lib.dl_upbitSellBuy import predict
from lib.Common import send_slack
from lib.Common import make_real_rate
from lib.Common import dfTrend
from lib.Common import dfInclease
import pyupbit
import time
import sys
import os
import aiomysql
import pymysql
import asyncio

import logging
import logging.config
import json

import common.constrant as Const
import config.localconf as conf
import pandas as pd
import numpy as np

config = json.load(open('config/logging.json'))
logging.config.dictConfig(config)
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

table_name="xxx.TB_ETH_TRADE"
table_report="xxx.TB_ETH_DAILY_REPORT"
table_config="xxx.TB_ETH_CASH_CONFIG"
table_crashed="xxx.TB_CRASHED_REPORT"
table_realrate="xxx.TB_REAL_REPORT"

pd.set_option('display.max_columns', 13)
pd.set_option('display.width', 200)

def get_avg_buy_price():
    try : 
        amount = upbit.get_amount('ALL')
        unit = trade.get_balance()[0]
        if unit == 0:
            return 0
        else:
            return amount / unit
    except Exception as ex:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        logger.error("get_avg_buy_price exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
        return 0

def get_avg_sell_price():
    try:
        balances, req = upbit.get_balances(contain_req=True)
        if len(balances) == 0:
            return 0

        avg_sell_price = 0
        for x in balances:
            if x['currency'] == conf.TRADE_TICKER:
                avg_sell_price = float(x['avg_sell_price'])
                break
        return avg_sell_price
    except Exception as x:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        logger.error("get_avg_sell_price exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
        return 0

async def read_all(sqlcon) -> pd.DataFrame:
    #read Backup DB 
    try:
        cursor = await sqlcon.cursor(aiomysql.cursors.DictCursor)
        str_query = "select * from %s" % table_name
        await cursor.execute(str_query)
        data = await cursor.fetchall()
        datadf = pd.DataFrame(data)
        return datadf
    except Exception as ex:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        logger.error("read from db exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
    finally:
        await cursor.close()

async def read_from(sqlcon, ntime) -> pd.DataFrame:
    try:
        cursor = await sqlcon.cursor(aiomysql.cursors.DictCursor)
        str_query = "select * from %s where `time` > %d" % (table_name, ntime)
        # logger.info("str_query:%s", str_query)
        await cursor.execute(str_query)
        data = await cursor.fetchall()
        datadf = pd.DataFrame(data)
        # logger.info("return df")
        # logger.info(datadf)
        return datadf
    except Exception as ex:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        logger.error("read from db exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
    finally:
        await cursor.close()

async def trade_proc():
    last_time = ""
    krw = 0
    first_time = True
    avg_buy_price = 0.0
    avg_sell_price = 0.0
    slack_flag = 0
    old_rate = 0
    rate_trend = ""

    try:
        pool = await aiomysql.create_pool(host="193.xxx.xxx.xx", user="xxxxxx", password='xxxxxxx', db='xxxxx')
        async with pool.acquire() as sqlcon:
            
            df = await read_all(sqlcon)
            idx = df.index.max()
            idx_time = df.loc[idx, 'time']
            sqlcon.close()
        while True:
            async with pool.acquire() as sqlcon:
                try:
                    await get_trade_yn(sqlcon)
                    start_time = time.time()

                    add_df = await read_from(sqlcon, idx_time)
                    df = pd.concat([df, add_df])
                    df = df.drop_duplicates(['time'], keep='first', inplace=False, ignore_index=True)
                    try:
                        avg_buy_price = get_avg_buy_price()
                        logger.info(cl["BOLD"] + cl["RED"] + "평균매수단가:%.2f" + cl["RESET"], avg_buy_price)
                    except Exception as ex:
                        exc_type, exc_obj, exc_tb = sys.exc_info()
                        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                        logger.error("get_avg_buy_price -> exception! %s : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))

                    ###잔고 조회
                    current_unit = 0.0
                    try:
                        balance = trade.get_balance()
                        price = pyupbit.get_current_price("KRW-" + conf.TRADE_TICKER)
                        logger.info("현재가 : %.4f" % price)
                        limit_cash = (balance[0] * price + balance[1]) * conf.REINVESTMENT_RATE if balance[0] != 0 or balance[1] != 0 else base_balance
                        logger.info("LIMIT_CASH: %.2f", limit_cash)
                        trade.set_limitcash(Const.LIMIT_CASH)
                        rate = ((balance[0]*price + balance[1] - base_balance)*100)/base_balance if base_balance is not None and base_balance != 0 else 0
                        trend_rate = old_rate - rate
                        old_rate = rate
                        if trend_rate == 0:
                            rate_trend += "="
                        elif trend_rate > 0:
                            rate_trend += "V"
                        else:
                            rate_trend += "^"

                        if len(rate_trend) > 5:
                            rate_trend = rate_trend[-5:]

                        logger.info(cl["ITELIC"] + cl["RED"] + "평가금액:%.2f 수익률: %.2f%% %s" + cl["RESET"], balance[0]*price + balance[1], rate, rate_trend)
                        #현재의 원화 잔고 얻기
                        krw = balance[1]
                        current_unit = balance[0]
                        logger.info(cl["BOLD"] + cl["RED"] + "잔고조회: %.5f %.2f" + cl["RESET"], balance[0], balance[1])
                    except Exception as ex:
                        exc_type, exc_obj, exc_tb = sys.exc_info()
                        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                        logger.error("get_current_price -> exception! %s : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))

                    logger.info(cl["BOLD"] + cl["GREEN"] + "clsmin %.2f(%s), clsmax:%.2f(%s), close:%.2f(%s)" + cl["RESET"], 
                        df.loc[idx, 'clsmin'], dfInclease(df.index.max(), df, "clsmin"), df.loc[idx, 'clsmax'], 
                        dfInclease(df.index.max(), df, "clsmax"), df.loc[idx, 'close'], dfInclease(df.index.max(), df, "close"))
                    logger.info(cl["BOLD"] + cl["GREEN"] + "min_rate:%.2f%%(%s) %s max_rate:%.2f%%(%s) %s" + cl["RESET"], 
                        df.loc[idx, 'min_rate'], dfInclease(df.index.max(), df, "min_rate"), 
                        dfTrend(df.index.max(), df, "minratedeg"), df.loc[idx, 'max_rate'], 
                        dfInclease(df.index.max(), df, "max_rate"), 
                        dfTrend(df.index.max(), df, "maxratedeg"))
                    logger.info("time date:" + time.strftime("%Y%m%d%H%M%S", time.localtime(idx_time)))
                    logger.info(cl["BOLD"] + cl["GREEN"] + "trade_ok:%s" + cl["RESET"], str(trade_ok))
                    dx = df
                    pred = predict(logger, dx, idx)
                    logger.info(cl["BOLD"] + cl["GREEN"] + "predict action:%s" + cl["RESET"], str(pred.get_dl_action()))
                    if pred.fsell() and current_unit > 0 :
                        logger.info("매도 타이밍: %s ", time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(idx_time)))
                        if trade_ok == True:
                            ret = trade.sell_crypto_currency()
                            if ret is not None:
                                cursor = await sqlcon.cursor(aiomysql.cursors.DictCursor)
                                try:
                                    ret = upbit.get_order("KRW-" + conf.TRADE_TICKER)
                                    while len(ret) != 0 : 
                                        ret = upbit.get_order("KRW-" + conf.TRADE_TICKER)
                                        await asyncio.sleep(0.01)
                                    str_query = """SELECT * 
                                        FROM main.TB_TRADE_HIST 
                                        WHERE user_id='%s' 
                                        AND qty > 0 
                                        AND end_dt = 0""" % user
                                    await cursor.execute(str_query)
                                    data = await cursor.fetchall()
                                    if len(data) > 0:
                                        fatch_result = pd.DataFrame(data)
                                        start_dt = fatch_result['start_dt'][0]
                                        qty = fatch_result['qty'][0]
                                        # logger.info("start_dt:%d", start_dt)
                                        str_query = """
                                        UPDATE main.TB_TRADE_HIST 
                                        SET 
                                            sell_qty = qty
                                            , qty=0
                                            , close=%f 
                                            , sell_sum =%f
                                            , end_dt=%d
                                        WHERE user_id = '%s' AND start_dt = %d""" % (
                                            df.loc[idx, 'close'], df.loc[idx, 'close']*qty, time.time(),  user, start_dt
                                        )
                                        await cursor.execute(str_query)
                                        await sqlcon.commit()
                                except Exception as ex:
                                    exc_type, exc_obj, exc_tb = sys.exc_info()
                                    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                                    logger.error("read db error -> exception! " + str(ex) + "` %s: %d", fname, exc_tb.tb_lineno)
                                finally:
                                    await cursor.close()
                    elif pred.fbuy():
                        logger.info("매수 타이밍 %s", time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(idx_time)))
                        if trade_ok == True:
                            ret = trade.buy_crypto_currency()
                            if ret is not None:
                                try:
                                    ret = upbit.get_order("KRW-" + conf.TRADE_TICKER)
                                    while len(ret) != 0 :
                                        ret = upbit.get_order("KRW-" + conf.TRADE_TICKER)
                                        await asyncio.sleep(0.01)
                                    avg_buy_price = get_avg_buy_price()
                                    balance = trade.get_balance()
                                    trdhist_clmns = ['user_id', 'start_dt', 'ticker', 'qty', 'cost', 'cost_sum', 'sell_qty', 'close', 'sell_sum', 'end_dt']
                                    trdhist_data  = [[user, df.loc[idx, 'time'], conf.TRADE_TICKER, balance[0], avg_buy_price, balance[0] * avg_buy_price,
                                                    0, 0, 0, 0]]
                                    trdhist_df = pd.DataFrame.from_records(data=trdhist_data, columns=trdhist_clmns)
                                    await aio_data_update(trdhist_df, sqlcon, "main.TB_TRADE_HIST", "replace")
                                except Exception as ex:
                                    exc_type, exc_obj, exc_tb = sys.exc_info()
                                    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                                    logger.error("read db error -> exception! " + str(ex) + "` %s: %d", fname, exc_tb.tb_lineno)
                                finally:
                                    await cursor.close()

                    logger.info(df.tail(5))

                    idx = df.index.max()
                    idx_time = df.loc[idx, 'time']
                    sqlcon.close()
                except Exception as ex:
                    exc_type, exc_obj, exc_tb = sys.exc_info()
                    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                    logger.error("`main -> exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
                    time.sleep(60)
                finally:
                    end_time = time.time()
                    if (end_time - start_time) > 0 and (end_time - start_time) <= Const.SLEEP_TIME:
                        await asyncio.sleep(Const.SLEEP_TIME - (end_time - start_time))
    except Exception as ex:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        logger.error("`main -> exception! %s ` : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
        time.sleep(60)
    finally:
        pool.close()
        await pool.wait_closed()         

async def get_trade_yn(sqlcon):
    global trade_ok
    global base_balance
    user = os.getenv('user')
    
    str_sql = """
        SELECT 
            a.TRADE_YN
            ,a.BASE_BALANCE
        FROM main.TN_PAYMENT_ACNT a
        WHERE a.USER_ID = '%s'
    """ % user
    cursor = await sqlcon.cursor(aiomysql.cursors.DictCursor)
    await cursor.execute(str_sql)
    user_info = pd.DataFrame(await cursor.fetchall())
    await cursor.close()

    trade_yn = np.asarray(user_info)[0][0]
    base_balance = float(np.asarray(user_info)[0][1])
    trade_ok = (trade_yn == 'Y')

def get_env():
    global upbit
    global trade
    global user
    user = os.getenv('user')
    logger.info("user[%s]", user)
    
    sqlcon = pymysql.connect(host='193.xxx.xxx.xx', user='xxxxxx', password='xxxxxxx', db='xxxx')
    str_sql = """
        SELECT 
            a.PUB_KEY
            ,a.APP_KEY
            ,a.SECU_KEY
            ,a.BASE_BALANCE
        FROM TN_PAYMENT_ACNT a
        WHERE a.USER_ID = '%s'
    """ % user
    cursor = sqlcon.cursor(pymysql.cursors.DictCursor)
    cursor.execute(str_sql)
    user_info = pd.DataFrame(cursor.fetchall())
    cursor.close()

    pub_key = np.asarray(user_info)[0][0]
    app_key = np.asarray(user_info)[0][1]
    sec_key = np.asarray(user_info)[0][2]
    base_balance = float(np.asarray(user_info)[0][3])

    # aes = AESCryptoCBC(pub_key)

    enc_app_key = app_key
    enc_sec_key = sec_key

    upbit = pyupbit.Upbit(enc_app_key, enc_sec_key)
    trade  = UpbitTrade(upbit, logger, conf.TRADE_TICKER, base_balance, base_balance * conf.REINVESTMENT_RATE)

if __name__ == '__main__':
    get_env()
    asyncio.run(trade_proc())

import torch
import torch.nn as nn
import torchvision
import torch.nn.functional as F

import plotly.graph_objects as go
import plotly.subplots as ms
import plotly as plt
from PIL import Image
import io
import os.path as path
import talib
import numpy as np

data_size = 100
action_kind = 4
screen_height = 50
screen_width  = 70

class DQN(nn.Module):
    def __init__(self, device, h, w, outputs, qsize):
        super(DQN, self).__init__()
        self.conv1 = nn.Conv2d(4, h*w, kernel_size=5, stride=2)
        self.bn1 = nn.BatchNorm2d(h*w)
        self.conv2 = nn.Conv2d(h*w, qsize, kernel_size=5, stride=2)
        self.bn2 = nn.BatchNorm2d(qsize)
        self.device = device
        self.conv3 = nn.Conv2d(qsize, qsize, kernel_size=5, stride=2)
        self.bn3 = nn.BatchNorm2d(qsize)

        linear_input_size = 3 * qsize
        self.head = nn.Linear(linear_input_size, outputs)

    # Called with either one element to determine next action, or a batch
    # during optimization. Returns tensor([[left0exp,right0exp]...]).
    def forward(self, x):
        x = x.to(self.device)
        x = F.relu(self.bn1(self.conv1(x)))
        # print("x1------->", x.size())
        x = F.relu(self.bn2(self.conv2(x)))
        # print("x2------->", x.size())
        x = F.relu(self.bn3(self.conv3(x)))
        # print("x3------->", x.size())
        return self.head(x.view(x.size(0), -1))

def get_chart(df, idx, max_data:int=300, i_w:int=140, i_h:int=100):
    ndf = df.head(idx).tail(max_data)
    ndf.reset_index(drop=True, inplace=True)
    
    candle = go.Candlestick(x=ndf.index,open=ndf['open'],high=ndf['high'],low=ndf['low'],close=ndf['close'], increasing_line_color = 'red',decreasing_line_color = 'blue', showlegend=False)
    upper = go.Scatter(x=ndf.index, y=ndf['upper'], line=dict(color='red', width=2), name='upper', showlegend=False)
    ma20 = go.Scatter(x=ndf.index, y=ndf['ma20'], line=dict(color='black', width=2), name='ma20', showlegend=False)
    lower = go.Scatter(x=ndf.index, y=ndf['lower'], line=dict(color='blue', width=2), name='lower', showlegend=False)

    volume = go.Bar(x=ndf.index, y=ndf['volume'], marker_color='red', name='volume', showlegend=False)

    MACD = go.Scatter(x=ndf.index, y=ndf['macd'], line=dict(color='blue', width=2), name='MACD', legendgroup='group2', legendgrouptitle_text='MACD')
    MACD_Signal = go.Scatter(x=ndf.index, y=ndf['signal'], line=dict(dash='dashdot', color='green', width=2), name='MACD_Signal')
    MACD_Oscil = go.Bar(x=ndf.index, y=ndf['flag'], marker_color='purple', name='MACD_Oscil')

    fast_k = go.Scatter(x=ndf.index, y=ndf['fast_k'], line=dict(color='skyblue', width=2), name='fast_k', legendgroup='group3', legendgrouptitle_text='%K %D')
    slow_d = go.Scatter(x=ndf.index, y=ndf['slow_d'], line=dict(dash='dashdot', color='black', width=2), name='slow_d')

    PB = go.Scatter(x=ndf.index, y=ndf['PB']*100, line=dict(color='blue', width=2), name='PB', legendgroup='group4', legendgrouptitle_text='PB, MFI')
    MFI10 = go.Scatter(x=ndf.index, y=ndf['MFI10'], line=dict(dash='dashdot', color='green', width=2), name='MFI10')

    RSI = go.Scatter(x=ndf.index, y=ndf['rsi14'], line=dict(color='red', width=2), name='RSI', legendgroup='group5', legendgrouptitle_text='RSI')
    
    # 스타일
    fig = ms.make_subplots(rows=5, cols=2, specs=[[{'rowspan':4},{}],[None,{}],[None,{}],[None,{}],[{},{}]], shared_xaxes=True, horizontal_spacing=0.03, vertical_spacing=0.01)

    fig.add_trace(candle,row=1,col=1)
    fig.add_trace(upper,row=1,col=1)
    fig.add_trace(ma20,row=1,col=1)
    fig.add_trace(lower,row=1,col=1)

    fig.add_trace(volume,row=5,col=1)

    fig.add_trace(candle,row=1,col=2)
    fig.add_trace(upper,row=1,col=2)
    fig.add_trace(ma20,row=1,col=2)
    fig.add_trace(lower,row=1,col=2)

    fig.add_trace(MACD,row=2,col=2)
    fig.add_trace(MACD_Signal,row=2,col=2)
    fig.add_trace(MACD_Oscil,row=2,col=2)

    fig.add_trace(fast_k,row=3,col=2)
    fig.add_trace(slow_d,row=3,col=2)

    fig.add_trace(PB,row=4,col=2)
    fig.add_trace(MFI10,row=4,col=2)

    fig.add_trace(RSI,row=5,col=2)

    # 추세추종
    # trend_fol = 0
    # trend_refol = 0
    # for i in ndf.index:
    #     if ndf['PB'][i] > 0.8 and ndf['MFI10'][i] > 80:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='orange', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)
    #     elif ndf['PB'][i] < 0.2 and ndf['MFI10'][i] < 20:
    #         trend_fol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='darkblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)
    #         fig.add_trace(trend_fol,row=1,col=1)

    # 역추세추종
    # for i in ndf.index:
    #     if ndf['PB'][i] < 0.05 and ndf['IIP21'][i] > 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='purple', marker_size=20, marker_symbol='triangle-up', opacity=0.7, showlegend=False)  #보라
    #         fig.add_trace(trend_refol,row=1,col=1)
    #     elif df['PB'][i] > 0.95 and ndf['IIP21'][i] < 0:
    #         trend_refol = go.Scatter(x=[ndf.index[i]], y=[ndf['close'][i]], marker_color='skyblue', marker_size=20, marker_symbol='triangle-down', opacity=0.7, showlegend=False)  #하늘
    #         fig.add_trace(trend_refol,row=1,col=1)    

    # fig.add_trace(trend_fol,row=1,col=1)
    # 추세추총전략을 통해 캔들차트에 표시합니다.

    # fig.add_trace(trend_refol,row=1,col=1)
    # 역추세 전략을 통해 캔들차트에 표시합니다.
    
    # fig.update_layout(autosize=True, xaxis1_rangeslider_visible=False, xaxis2_rangeslider_visible=False, margin=dict(l=50,r=50,t=50,b=50), template='seaborn', title=f'({ticker})의 날짜: ETH [추세추종전략:오↑파↓] [역추세전략:보↑하↓]')
    # fig.update_xaxes(tickformat='%y년%m월%d일', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray', showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_yaxes(tickformat=',d', zeroline=True, zerolinewidth=1, zerolinecolor='black', showgrid=True, gridwidth=2, gridcolor='lightgray',showline=True,linewidth=2, linecolor='black', mirror=True)
    # fig.update_traces(xhoverformat='%y년%m월%d일')
    # size = len(img)
    # img = plt.io.to_image(fig, format='png')
    # canvas = FigureCanvasBase(fig)
    # img = Image.frombytes(mode='RGB', size=(700, 500), data=fig.to_image().to, decoder_name='raw')
    # img = Image.frombytes('RGBA', (700, 500), plt.io.to_image(fig, format='png'), 'raw')
    # img = Image.fromarray('RGBA', (700, 500), np.array(plt.io.to_image(fig, format='png')), 'raw')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'RGB')
    # img = Image.fromarray(np.array(plt.io.to_image(fig, format='png')), 'L')
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=140, height=100)))
    # img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png', width=700, height=500)))
    img = Image.open(io.BytesIO(plt.io.to_image(fig, format='png')))
    # print(img)
    img.convert("RGB")
    # print(img)
    img.thumbnail((i_h, i_w), Image.ANTIALIAS)
    # print(img)
    # img = Image.Image(img, 'RGB')
    # img.show()
    return img

from lib.addAuxiliaryData import Auxiliary

class predict():
    def __init__(self, logger, df, idx) -> None:
        df['open'] = df['start']
        df['sma3'] = talib.SMA(np.asarray(df['close']), 3)
        aux = Auxiliary(logger)
        aux.add_macd(df,12,26,9)
        aux.add_bbands(df)
        aux.add_relative_strength(df)
        aux.add_stock_cast(df)
        aux.add_iip(df)
        aux.add_mfi(df)
        # df.dropna(inplace=True)
        df.reset_index(drop=True,inplace=True)
        # gpu가 사용 가능한 경우에는 device를 gpu로 설정하고 불가능하면 cpu로 설정합니다.
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        converter = torchvision.transforms.ToTensor()
        # 모델을 지정한 장치로 올립니다.
        self.model = DQN(device, screen_height, screen_width, action_kind, 4).to(device)
        # model = CNNRNN(device, screen_height, screen_width, action_kind, 4).to(device)
        # model = CNN().to(device)
        self.model = nn.DataParallel(self.model, device_ids=[0,1]).to(device)

        if path.exists("/home/yubank/pt/train_dqn_{0:02d}_{1}.pt".format(action_kind, device)):
            self.model.load_state_dict(torch.load("/home/yubank/pt/train_dqn_{0:02d}_{1}.pt".format(action_kind, device)))

        self.model.eval()

        self.x = get_chart(df, idx, data_size, i_w = screen_width, i_h = screen_height)
        self.x = converter(self.x).unsqueeze(0).to(device).squeeze(1)

    def get_dl_action(self):
        output = self.model.forward(self.x)
        _,action = torch.max(output,1)
        # print("action:", action)
        return action.cpu().numpy()[0]

    def fbuy(self):
        return True if self.get_dl_action() == 0 else False

    def fsell(self):
        return True if self.get_dl_action() == (action_kind - 1) else False

우여 곡절이 중간에 조금 있었습니다.

학습된 모델을 pt파일로 저장을 한 다음 predict 클래스를 만들었습니다.

파이선의 클래스는 static method와 일반 method의 개념이 없어서 그냥 함수 묶음으로도 사용이 가능합니다.

적용 후 실제 upbit에서 거래가 발생했고 한 차례도 손실없이 매매를 하고 있습니다.

1주일을 일단 집에서 돌리고 고로케이션을 알아보고 있는데 최소 30만원(랙 마운트 케이스 및 파워구매, 대충 4u 로케이션 가격 15만원 설치비 5만원(IDC내부 회선 설치 및 전원공사임 내서버 설치는 별도임))의 추가 비용이 발생합니다.

수익률이 안나온다면 코로케이션을 진행할 수 없을 것 같네요.

맺은말

궁금한 점이나 이상하다고 생각되시면 답글로 남겨 주세요.

세세하고 친절히 답변 드리겠습니다.

LIST

저작자표시 비영리

'python > 자동매매 프로그램' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리 (5)	2023.02.25
강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 (1)	2023.02.14
강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델 (1)	2022.12.25
강화학습을 이용한 비트코인 매매프로그램(7)-Back Test (1)	2022.12.24
강화학습을 이용한 비트코인 매매프로그램(10) - CNN+RNN 모델 확장 (0)	2022.12.21

python - 3 사칙 연산

아빠는 벌레잡이 2023. 1. 24. 20:29

2023. 1. 24. 20:29

python은 C에 기본을 두지만 --a, a++은 지원하지 않습니다.

그렇지만 a += 1은 지원을 하므로 연속 연산을 사용하고자 하는 경우는 뒤의 경우를 사용하면 됩니다.
기본적인 +, -, *, / 와 같은 사칙 연산 외에도 정수 나눗셈(//)도 지원합니다.
C 나 Java에서는 없는 연산인데 파이썬에는 정수 나눗셈을 별도로 두고 있습니다.

이유는 python이 C와 같은 자료형을 따르고는 있지만

변수를 선언할때 별도의 형 선언을 하지 않아 결과를 예측할 수 없기 때문입니다.

일반적으로 정수 + 소수 = 소수가 되며 정수 - 소수 = 소수가 됩니다.
그렇지만 6 / 2 를 했을 경우 3(정수)일까 아니면 3.0(소수)일까요?

아니 그것 보다는 실질적으로 7/2를 하면 3.5가 되어야 할까 3이 되어야할까요?

(일반적으로 C같은 자료 구조를 가진 경우는

7/2의 경우 자동으로 정수 연산을 실행하고 변수 또한 정수 변수에 대입하기 때문에

자동으로 값이 3.5가 아닌 3이 됩니다.

그래서 본의 아니게 버그를 만드는 경우가 많이 발생 합니다.

하지만 파이썬의 경우 정수 나눗셈과 소수 나눗셈(기본값)을 나누어 가지고 있기 때문에 / 을 사용했다면

무조건 소수가 되고 변수의 형도 소수형으로 변경이 됩니다. -- 주의 필요 --

)
의외로 현장에서는 7/2가 3일 필요가 꽤 있습니다.

그런 경우 python 은 7//2로 정수 나눗셈을 구해야 합니다.
예를 들면 모수가 5로 나누어 떨어지지 않는 경우 오류가 나는 머쉰 런닝 모델이 있다면

전체 모수 t 를 t = (t//5) * 5를 하면 정확하게 5로 나누어 떨어지는 수를 구할 수 있습니다.
정수 나눗셈의 나머지는 7%2로써 답은 1이 됩니다.

이 외에 3항 식 이라고 하는
b = a == 2 ? 3 : 4 의 경우는 C에서의 표현이지만 python은 b = 3 if a == 2 else 4 라고 표현한다.

어떻게 보면 C/C++의 또 다른 방향의 발전체가 파이썬 인 것입니다.

LIST

'python > python 기초문법' 카테고리의 다른 글

Python - 2 (0)	2023.01.24
파이썬 기초 1 - 설치 (0)	2023.01.24

Python - 2

아빠는 벌레잡이 2023. 1. 24. 18:24

2023. 1. 24. 18:24

모든 프로그램이 그러하듯이 파이썬도 왼쪽에 변수 = 을 오른쪽에 값을 배치합니다.

변수에 값을 담는다는 의미입니다.

거의 대부분의 프로그램이 그렇게 운영 됩니다.

>>>a=30
>>>b=40
>>>result = a + b
>>>print("result: %04d" % result)
result: 0070

python shell에 실행을 위의 문장을 실행하면 a=30과 같이 값을 입력하는 과정은 나타나지 않습니다.

그리고 맨 마지막 줄에 print문을 만나면 비로써 출력이 됩니다.
그런데 출력 방법이 이전과 다름니다.
C 언어를 공부하신 분은 금방 이게 글짜를 포맷팅 하는 거라는 것을 알 수 있습니다.
print에 그냥 ',' 만 기술하면 그냥 문자를 연속적으로 출력할 뿐입니다.
python은 객체함수 format을 가집니다.
python은 자바처럼 클래스와 클래스 내부함수를 가지기 때문입니다.
자바가 상속과 피상속을 구현하지 않고 확장의 개념을 가지는 것과 다르게

python은 하위 상속과 피상속의 개념이 있습니다.

파이썬은 자바처럼 정수변수의 값을 바로 문자로 변환해 주기는 하지만 연산 과정에서는 문자 type가 정수 type은 연산이 불가능하다며 에러를 냅니다 .
그것을 보면 파이썬은 자바 같은 고급언어 보다는 C언어에 더 가깝습니다.
언어를 어떻게 사용하느냐 방법에 따라 언어는 인터프린터 언어와 컴파일 언어로 나누어 집니다.
인터프린터 언어는 자바스크립트나 php같은 인터넷 언어가 많습니다.
컴파일 언어는 C,C++,Java 와 같이 서버언어가 많습니다.
파이썬은 약간 중간 개념을 가지고 있습니다.
실행하기 전에 해당 코드 블럭은 컴파일 하듯이 검사해서 오류가 없는 경우 실행하고 오류가 있을 경우 실행하지 않습니다.
그리고 위와 같이 shell기능을 제공합니다.
어디에도 없는 특이한 기능입니다.
OS를 배우신 분들은 OS를 구성하는 것이 기계를 직접 제어하는 kernel과

kernel가 인간에게 인터페이스를 제공하는 shell로 되어 있다는 것을 아실 겁니다.
다른 언어는 다 만들어 진다음 인터넷이나 shell에서 실행 되지만 python은 처음부터 자신의 shell을 가지고 있습니다.
어떻게 보면은 파이썬은 여러 가지 영역을 다 다루는 광범위한 언어라고 할 수 있습니다.

LIST

'python > python 기초문법' 카테고리의 다른 글

python - 3 사칙 연산 (0)	2023.01.24
파이썬 기초 1 - 설치 (0)	2023.01.24

파이썬 기초 1 - 설치

아빠는 벌레잡이 2023. 1. 24. 17:57

2023. 1. 24. 17:57

처음 파이썬을 공부하면 책에는 아나콘다설치 니 venv 설치니해서 많은 내용이 나옵니다.

에디터도 뭐를 사용해야 할지 헷갈리고 합니다.

그 전에 아셔야 할 것은 파이썬은 기존 리눅스 시스템의 대부분의 기능이 파이썬 언어를 사용해서 만들어 지거나 라이브러리로 사용되어 있고 bash같은 shell프로그램 및 xwindows app프로그램도 파이썬으로 제작된 경우가 많습니다.

리눅스의 경우 기본적으로 파이썬이 다 깔려 있기 때문에 기본적인 반복작업을 파이썬으로 만들면 편하게 일을 할 수 있습니다.

윈도우도 맞찬가지 입니다.

파이썬 윈도우 auto기능을 이용하면 엑셀로 수 많은 데이터를 옮기고 계산하는것도 한번에 할 수 있습니다.

단 당장해야 하는 일은 파이썬에 멋진 기능이 있어도 그냥 손으로 하는게 빠를 수 있습니다.

그리지만 계속 반복해서 해야 하는 작업은 미리 파이썬 스크립트를 만들어 두고 사용하면 다른 사람에 비해서 훨씬 빨리 일을 끝낼 수 있을 것입니다.

만약에 윈도우 환경에서 파이썬을 처음 배운다면 그냥 최신버젼의 파이썬을 깔고 에디터는 VSCODE나 기타 가벼운 에디터를 사용해서 배울 수 있습니다.
오늘은 파이썬을 이용하여 간단한 계산을 하는 기능을 보겠습니다. cmd창에 py 또는 python을 입력하고 -V 옵션을 주어 깔려 있는 파이썬 버젼을 체크 합니다.

py -V    or    python -V

파이썬은 파이썬 2와 3이 있습니다.

지금 윈도우는 거의 파이썬 3이 깔려 있지만 리눅스는 파이썬2가 깔려 있는 경우도 많습니다.

파이썬2와 파이썬3은 문법적인 부분이 많이 다릅니다.

파이썬3을 배우는 것이 유리합니다.

아무래도 파이썬2는 전문 기술자들이 사용하던 버젼이라고 보시면 될 것 같습니다.
python을 옵션 없이 실행하면 python shell이 열립니다.

python
Python 3.9.5 ...
Type "help"--
>>>

>>> 이렇게 프롬프트가 나옵니다.

>>> a = 30
>>> b = 40
>>> print("total:", (a + b))
total:70

a=30 은 단순히 a라는 방에 30을 넣는 작업이며 출력은 없습니다.
b=40은 단순히 b라는 방에 40을 넣는 작업이며 출력은 없습니다.
print("total:", a+b)를 만나면 비로서 출력 total:70을 실행합니다.
python의 문법은 사실 C를 기반으로 만들어져서 겉과 다르게 상당히 복잡한 문법이 많습니다.

그렇지만 대부분의 사람들 특히 데이터 사이언스나 빅데이터를 공부하시는 분들은 사용할 일이 없지만 C와 마찬가지로 포인트를 제공합니다.

LIST

'python > python 기초문법' 카테고리의 다른 글

python - 3 사칙 연산 (0)	2023.01.24
Python - 2 (0)	2023.01.24

구글 colab에서 학습하기

아빠는 벌레잡이 2023. 1. 20. 07:38

2023. 1. 20. 07:38

data.dat

2.86MB

https://colab.research.google.com/drive/108pVqpUtkD6Tz-cKtVfH-0_DXHZPFq_p?usp=share_link

cnn_img.ipynb

Colaboratory notebook

colab.research.google.com

https://colab.research.google.com/?hl=ko

Google Colaboratory

colab.research.google.com

위 링크에 접속하면 코랩의 새 노트가 뜹니다. 취소하고 자신이 로컬에서 작성한 노트를 업로드해도 되고 계속해서 새 코드를 작성해도 됩니다.

가장 위의 링크는 제가 작성한 CNNRNN모델이며 COLAB의 세션 타임이 그리 길지 않기 때문에 학습용 데이터로 기동 여부만 확인 후 로컬 시스템이나 azure를 사용하거나 코랩 상위랩 구매가 필요할 수 있습니다. COLAB은 무료이기는 하지만 그래도 쓸만한 하드웨어 사양과 4~5시간의 런타임 세션을 제공해 줍니다.

LIST

저작자표시 비영리

'python > 머쉰러닝' 카테고리의 다른 글

Scikit-Learn 매매데이터 분석(1) (0)	2022.07.03

강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델

아빠는 벌레잡이 2022. 12. 25. 16:38

2022. 12. 25. 16:38

nvidia의 tesla t4 16G의 중고가 약 350만원정도에 cuda 프로세의 갯수는 2560개이다. 오로지 머쉰런닝에만 사용되므로 Rtx 시리즈와는 완전히 다르다고 할 수 있습니다. 무엇보다 가격대인데요. tesla t4모델을 다른 이유로 IDC상면가를 알아보다 GPU서버 임대가를 보는데 보통 한달에 몇백이길래 햐 이건 강화학습 따위로 매매프로그램 만드는 회사는 망하겠다고 생각하고 RTX버젼과 RTX A시리즈와 TESLA버젼이 따로 임대가 되고 있길래 가격을 봤더니 TESLA H100 80G모델이 9천에 몇백 빠지는 8천대 가격이 더라고요. 사실 현재 프로그램의 이미지 원본이 500 X 700이라 이걸 돌린거랑 현재처럼 줄인 버젼의 이미지를 돌렸을 경우 어느정도의 차이가 생길까 하는 의문이 계속되고 있습니다. 사실 CUDA 코어 갯수는 현재 정도면 되고 GDDR이라고 크게 메카니즘이 다른것은 아닐거고 메모리만 늘리면 고해상도가 아니어도 충분이 돌것 같은데 왜 GPU보드는 메모리슬롯을 확장가능하게 만들지 않았을까요? ResNet의 경우는 개발자가 인공신경망을 몇겹까지 쌓을 수 있을까 하는 의문에 만들었다고 합니다.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block_1(in_dim, out_dim, act_fn, stride=1):
    model = nn.Sequential(
        nn.Conv2d(in_dim, out_dim, kernel_size=1, stride=stride),
        act_fn,
    )
    return model

def conv_block_3(in_dim, out_dim, act_fn):
    model = nn.Sequential(
        nn.Conv2d(in_dim, out_dim, kernel_size=3, stride=1, padding=1),
        act_fn,
    )
    return model

class BottleNeck(nn.Module):
    def __init__(self, in_dim, mid_dim, out_dim, act_fn, down=False):
        super(BottleNeck, self).__init__()
        self.act_fn = act_fn
        self.down = down
        
        if self.down:
            self.layer = nn.Sequential(
                conv_block_1(in_dim, mid_dim, act_fn, 2),
                conv_block_3(mid_dim, mid_dim,act_fn),
                conv_block_1(mid_dim, out_dim, act_fn),
            )
            self.downsample = nn.Conv2d(in_dim, out_dim, 1, 2)
        else:
            self.layer = nn.Sequential(
                conv_block_1(in_dim, mid_dim, act_fn),
                conv_block_3(mid_dim, mid_dim,act_fn),
                conv_block_1(mid_dim, out_dim, act_fn),
            )
            self.dim_equalizer = nn.Conv2d(in_dim, out_dim, kernel_size=1)
    
    def forward(self, x):
        if self.down:
            downsample = self.downsample(x)
            out = self.layer(x)
            out = out + downsample
        else:
            out = self.layer(x)
            if x.size() is not out.size():
                x = self.dim_equalizer(x)
            out = out + x
        return out

class ResNet(nn.Module):
    def __init__(self, base_dim, num_classes=2):
        super(ResNet, self).__init__()
        self.act_fn = nn.ReLU()
        self.layer_1 = nn.Sequential(
            nn.Conv2d(3,base_dim,7,2,3),
            nn.ReLU(),
            nn.MaxPool2d(3,2,1),
        )
        
        self.layer_2 = nn.Sequential(
            BottleNeck(base_dim, base_dim, base_dim*4, self.act_fn),
            BottleNeck(base_dim*4, base_dim, base_dim*4, self.act_fn),
            BottleNeck(base_dim*4,base_dim,base_dim*4,self.act_fn, down=False),
        )
        
        self.layer_3 = nn.Sequential(
            BottleNeck(base_dim*4, base_dim*2, base_dim*8, self.act_fn),
            BottleNeck(base_dim*8, base_dim*2, base_dim*8, self.act_fn),
            BottleNeck(base_dim*8, base_dim*2, base_dim*8, self.act_fn),
            BottleNeck(base_dim*8,base_dim*2,base_dim*8,self.act_fn, down=False),
        )

        self.layer_4 = nn.Sequential(
            BottleNeck(base_dim*8, base_dim*4, base_dim*16, self.act_fn),
            BottleNeck(base_dim*16, base_dim*4, base_dim*16, self.act_fn),
            BottleNeck(base_dim*16, base_dim*4, base_dim*16, self.act_fn),
            BottleNeck(base_dim*16, base_dim*4, base_dim*16, self.act_fn),
            BottleNeck(base_dim*16, base_dim*4, base_dim*16, self.act_fn),
            BottleNeck(base_dim*16,base_dim*4,base_dim*16,self.act_fn, down=False),
        )

        self.layer_5 = nn.Sequential(
            BottleNeck(base_dim*16, base_dim*8, base_dim*32, self.act_fn),
            BottleNeck(base_dim*32, base_dim*8, base_dim*32, self.act_fn),
            BottleNeck(base_dim*32, base_dim*8, base_dim*32,self.act_fn, down=False),
        )
        
        self.avgpool = nn.AvgPool2d(7, 1)
        self.fc_layer = nn.Linear(base_dim*3, num_classes)
        
    def forward(self, batch_size, x):
        out = self.layer_1(x)
        out = self.layer_2(out)
        out = self.layer_3(out)
        out = self.layer_4(out)
        out = self.layer_5(out)
        out = self.avgpool(out)
        out = out.view(batch_size, -1)
        out = self.fc_layer(out)
        return out

class ResNetRNN(nn.Module):
    def __init__(self, device, h, w, outputs, hdnsize):
        super(ResNetRNN, self).__init__()
        self.device = device
        self.hidden_size = hdnsize
        self.resnet = ResNet(h*w, hdnsize)
    
        self.i2h = nn.Linear(54 * hdnsize, hdnsize)
        self.h2h = nn.Linear(hdnsize, hdnsize)
        self.i2o = nn.Linear(hdnsize, outputs)
        self.act_fn = nn.Tanh()

    def init_hidden(self):
        return torch.zeros(1, self.hidden_size)
    
    def forward(self, batch_size, x):
        out = self.resnet(batch_size, x)
        hidden = self.init_hidden().to(self.device)
        out= self.i2h(out.view(out.size(0), -1))
        hidden = self.h2h(hidden)
        hidden = self.act_fn(out + hidden)
        return self.i2o(hidden)

위 코드는 resnet과 rnn을 합성한 코드 입니다. resnet은 책 '파이토치 첫걸음'의 소스 내용을 응용했고 rnn은 이전 cnnrnn의 소스에서 참조 했습니다. 파일 이름은 ResNet.py입니다.

import random
from collections import deque, namedtuple
from IPython.display import display, Math
from Account import Account
import math
from itertools import count
import os

import sys
import os.path as path
import plotly as plt
import torch
import torch.nn as nn
import torch.optim as optim
from Market import Market
import torchvision
import time
from ResNet import ResNetRNN
from memory import ReplayMemory, Experience
# from transformers import get_cosine_schedule_with_warmup
import transformers

action_kind = 41
max_episode = 5000
screen_height = 100
screen_width  = 140
data_size = 250
epsilon = 0.3
dis = 0.9

BATCH_SIZE = 8
# GAMMA = 0.999
GAMMA = 0.99
EPS_START = 0.9
EPS_END = 0.05
EPS_DECAY = 200
TARGET_UPDATE = 10
steps_done = 0
loss = any
WINDOW_START = 0
WINDOW_SIZE  = 1500

# for i in range(torch.cuda.device_count()):
#     print(torch.cuda.get_device_name(i))
device_str = "cuda"
device = torch.device(device_str)
    
memory = ReplayMemory(BATCH_SIZE)
converter = torchvision.transforms.ToTensor()
market = Market()
train_net = ResNetRNN(device, screen_height, screen_width, action_kind, 32).to(device)
train_net = nn.DataParallel(train_net, device_ids=[0,1]).to(device)

episode_durations = []
optimizer = optim.RMSprop(train_net.parameters(), lr=0.000001, eps=0.00000001)
# scheduler = get_cosine_schedule_with_warmup(optimizer, 5, base_lr=0.3, final_lr=0.01)
scheduler = transformers.get_cosine_schedule_with_warmup(optimizer, 
                                                         num_warmup_steps=5, 
                                                         num_training_steps=25)
def optimize_action(memory):
    if len(memory) < BATCH_SIZE:
        return None
    return loss

# def select_action(df, idx):
#     try:
#         global steps_done
#         sample = random.random()
#         eps_threshold = EPS_END + (EPS_START - EPS_END) * math.exp(-1. * steps_done / EPS_DECAY)
#         steps_done += 1
#         if sample > eps_threshold:
#             if df.loc[idx, "closemax"] == df.loc[idx, "close"]:
#                 action = 2
#                 return  action
#             elif df.loc[idx, "closemin"] == df.loc[idx, "close"]:
#                 action = 1
#                 return  action
#             else:
#                 return 0
#         else:
#             action = random.randrange(action_kind)
#             return  action
#     except Exception as ex:
#         exc_type, exc_obj, exc_tb = sys.exc_info()
#         fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
#         print("`select_action -> exception! %s : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
#         return 0
def select_action(df, idx):
    return ((df.loc[idx, "close"] - df.loc[idx, "closemin"]) * (action_kind-1) // (df.loc[idx, "closemax"] - df.loc[idx, "closemin"]))
        
def get_chart(market, idx, max_data):
    img = market.get_chart(idx, max_data=max_data)
    if img is None:
        return None
    # img = Image.fromarray(np.uint8(cm.gist_earth(plt.io.to_image(fig, format='png')*255)))
    # im = Image.fromarray(img, bytes=True)
    # im = Image.fromarray(np.uint8(cm.gist_earth(img))/255)
    # im = Image.fromarray(np.uint8(img)/255)
    # img = img.resize((700, 500), resample=Image.BICUBIC)
    # img = Image.fromarray(cm.gist_earth(plt.io.to_image(fig, format='png'), bytes=True))
    # display(img)
    # chart = converter(img).unsqueeze(0).to(device)
    chart = converter(img).unsqueeze(0)
    return chart

def plot_durations(last_chart, curr_chart):
    plt.figure()
    # plt.subplot(1,2,1)
    img = plt.imshow(last_chart.cpu().squeeze(0).permute(1, 2, 0).numpy(), interpolation='none')
    plt.title('Example extracted screen')
    plt.figure(2)
    # plt.subplot(1,2,2)
    plt.clf()
    durations_t = torch.tensor(episode_durations, dtype=torch.float)
    plt.title('Training...')
    plt.xlabel('Episode')
    plt.ylabel('Duration')
    plt.plot(durations_t.numpy())
    # plt.show()
    # Take 100 episode averages and plot them too
    if len(durations_t) >= 100:
        means = durations_t.unfold(0, 100, 1).mean(1).view(-1)
        means = torch.cat((torch.zeros(99), means))
        plt.plot(means.numpy())

    img.set_data(curr_chart.cpu().squeeze(0).permute(1, 2, 0).numpy())
    plt.pause(0.01)  # pause a bit so that plots are updated
    display.clear_output(wait=True)
    display.display(plt.gcf())
            
def main():
    if path.exists("pt/train_resnetrnn_{}.pt".format(device_str)):
        train_net.load_state_dict(torch.load("pt/train_resnetrnn_{}.pt".format(device_str)))
        
    for _ in range(10):    
        df = market.get_data()
        # df = df.head(WINDOW_SIZE)

        for epoch in range(max_episode):
            account = Account(df, 50000000)
            account.reset()
            for idx,_ in enumerate(df.index, start=data_size):
                try:
                    since = time.time()
                    curr_chart = get_chart(market, idx, data_size)
                    reward = 0
                    num_action = select_action(df, idx)
                    reward, real_action = account.exec_action(2 if num_action == (action_kind - 1) else (1 if num_action == 0 else 0) , idx)
                    print("idx:%d==>action:%d,real_action==>%d price:%.2f"%(idx, num_action, real_action, df.loc[idx, 'close']))
                    # reward = torch.tensor([reward], device=device)
                    action = torch.tensor([[num_action]], device=device, dtype=torch.int64)
                                    
                    memory.push(curr_chart, action, reward)
                    if len(memory) >= BATCH_SIZE:
                        epsode = memory.pop(BATCH_SIZE)
                        batch = Experience(*zip(*epsode))
                        state_batch = torch.cat(batch.state)
                        action_batch = torch.cat(batch.action)
                        train_net.train()
                        state_action_values = train_net(state_batch).gather(1, action_batch)
                        # optimizer = optim.RMSprop(train_net.parameters(), 0.01)
                        
                        criterion = nn.SmoothL1Loss()
                        loss = criterion(state_action_values, action_batch)

                        # Optimize the model
                        # optimizer.zero_grad()
                        optimizer.zero_grad()
                        loss.backward()
                        for param in train_net.parameters():
                            param.grad.data.clamp_(-1, 1)
                        optimizer.step()
                        if loss is not None:
                            print("epoch[%d:%d] epsode is next loss[%.10f]" % (epoch, idx, loss.item()))

                    if idx % TARGET_UPDATE == 0:
                        torch.save(train_net.state_dict(),"pt/train_resnetrnn_{}.pt".format(device_str))
                    
                    spend = time.time() - since
                    print("idx:%d price [%.4f] unit[%.4f] used time[%.2f] agent rate:%.05f remind money:%.02f" 
                            % (idx, df.loc[idx, 'close'], account.unit, spend, account.rate, account.balance + account.unit * df.loc[idx, 'close']))
                    if account.is_bankrupt():
                        break
                    if idx == df.index.max():
                        break
                except Exception as ex:
                    exc_type, exc_obj, exc_tb = sys.exc_info()
                    fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                    print("`recall_training -> exception! %s : %s %d" % (str(ex) , fname, exc_tb.tb_lineno))
            scheduler.step()

        print("end training DQN")
    
    print('Complete Training')

if __name__ == "__main__":
    main()

위는 ResNet.py의 ResNetRNN클래스를 활용하여 학습을 진행하는 main_resnetrnn.py입니다.

글을 마치며

훈련이 잘 되어 거의 0에 수렴한 결과를 가지고 백테스팅을 진행해 보니 매수 자체를 안하는 결과가 나왔습니다 초반의 학습률이 굉장히 낮을 때보다 결과가 좋지 않았습니다. 그래서 지금은 전체를 여러 단계로 나누고 그 중 (결과값은 인덱스 이므로 언제나 0부터 액션의 가지수 -1 만큼 발생 합니다.) 0과 가장 큰값을 매수와 매도로 설정 나머지는 매매를 안하고 관망 하게 소스를 수정 했습니다. 문제는 액션 수를 늘리니 이번에는 학습률이 굉장히 떨어지는 문제가 발생합니다. 이제 모델은 거의 다 만든것 같습니다 .지금부터는 개인의 삽질만이 방법이 겠죠?

LIST

'python > 자동매매 프로그램' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화 (1)	2023.02.14
강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용 (5)	2023.01.30
강화학습을 이용한 비트코인 매매프로그램(7)-Back Test (1)	2022.12.24
강화학습을 이용한 비트코인 매매프로그램(10) - CNN+RNN 모델 확장 (0)	2022.12.21
강화학습을 이용한 비트코인 매매프로그램(9) - CNN+RNN 모델 (0)	2022.12.20

PREV 이전 1 2 3 NEXT 다음

오늘도 아빠는 벌레잡는 중

python

DQN network 수정

1.머리말

2.H/W의 변경

'python > 자동매매 프로그램' 카테고리의 다른 글

Colab에서 살아남기

'python > 자동매매 프로그램' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(14) - 학습 파일 단위 분리

'python > 자동매매 프로그램' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(13) - 강화학습 최적화

'python > 자동매매 프로그램' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(12) - 실거래 적용

'python > 자동매매 프로그램' 카테고리의 다른 글

python - 3 사칙 연산

'python > python 기초문법' 카테고리의 다른 글

Python - 2

'python > python 기초문법' 카테고리의 다른 글

파이썬 기초 1 - 설치

'python > python 기초문법' 카테고리의 다른 글

구글 colab에서 학습하기

'python > 머쉰러닝' 카테고리의 다른 글

강화학습을 이용한 비트코인 매매프로그램(11) - ResNet + RNN 적용 모델

'python > 자동매매 프로그램' 카테고리의 다른 글

+ Recent posts

티스토리툴바