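What Is a Convolutional Neural Network

Convolutional neural networks (CNNs) give excellent results in image and speech recognition and have been widely adopted in recent years. A convolutional layer is made of filters (also called convolution kernels), each of which sweeps across the image like a small spotlight and responds to one local pattern.

The name is easiest to understand in two parts:

  • Convolution: instead of processing individual pixels, the network processes the image one small patch at a time. This preserves the continuity of the image information, so the network sees a shape rather than an isolated point.
  • Neural network: the result of each convolution is passed through an activation function, as in any other neural network layer.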
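To make the patch-by-patch idea concrete, here is a minimal library-free sketch. The function names `conv2d` and `relu`, the toy image, and the vertical-edge kernel are all illustrative assumptions, not part of the tutorial code; a real network learns its kernels instead of having them written by hand.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (both 2D lists) and sum the products for each patch."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Activation function: keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 4x5 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Hand-written kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = relu(conv2d(image, vertical_edge))
print(feature_map)  # [[0.0, 3.0, 3.0], [0.0, 3.0, 3.0]]
```

The output is zero where the patch is uniformly dark and positive where the patch contains the edge, which is the sense in which the filter sees a shape rather than single pixels.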

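Stacking several convolutions produces the features used for classification.

The most basic build pipeline is: convolution, activation, and pooling, repeated as needed, followed by a fully connected layer that outputs one score per class.

Handwritten Digit Recognition with a CNN

As before, the task is to recognize images of handwritten digits.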

import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision      # provides the MNIST dataset and image transforms
import matplotlib
matplotlib.use('TkAgg')  # select the backend before pyplot is imported
import matplotlib.pyplot as plt

torch.manual_seed(1)    # reproducible

# Hyper Parameters
EPOCH = 1           # number of passes over the training set; a single pass to save time
BATCH_SIZE = 50
LR = 0.001          # learning rate


# MNIST handwritten digits
train_data = torchvision.datasets.MNIST(
    root='./mnist/',    # where the data is stored or loaded from
    train=True,  # this is training data
    transform=torchvision.transforms.ToTensor(),    # converts a PIL.Image or numpy.ndarray to a
                                                    # torch.FloatTensor (C x H x W) scaled to the range [0.0, 1.0]
    download=True,  # download only if the data is not already present
)


test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)

# mini-batches of 50 samples, 1 channel, 28x28: shape (50, 1, 28, 28)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
# to save time, evaluate on the first 2000 test samples only
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000]/255.   # shape from (2000, 28, 28) to (2000, 1, 28, 28), values in [0, 1]
test_y = test_data.targets[:2000]

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(  # first convolution block
            nn.Conv2d(  # input shape (1, 28, 28)
                in_channels=1,  # number of input channels (grayscale image)
                out_channels=16,  # number of filters
                kernel_size=5,  # filter size (5x5)
                stride=1,  # step of the filter movement
                padding=2,  # to keep height and width unchanged with stride=1, use padding=(kernel_size-1)/2
            ),  # convolution layer (filters), output shape (16, 28, 28)
            nn.ReLU(),  # activation function, output shape (16, 28, 28)
            nn.MaxPool2d(kernel_size=2),  # take the maximum of each 2x2 region as the feature, halving height and width: output shape (16, 14, 14)
        )
        self.conv2 = nn.Sequential(  # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),  # output shape (32, 14, 14)
            nn.ReLU(),  # activation
            nn.MaxPool2d(2),  # output shape (32, 7, 7)
        )
        # (32, 7, 7) is the output shape of conv2 above
        self.out = nn.Linear(32 * 7 * 7, 10)  # fully connected layer, output 10 classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)  # (batch, 32, 7, 7)
        x = x.view(x.size(0), -1)  # flatten each sample: (batch, 32*7*7)
        output = self.out(x)
        return output

cnn = CNN()
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR)   # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss()   # CrossEntropyLoss applies log-softmax internally, so the network outputs raw scores

# training and testing
for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):   # b_x is already scaled to [0, 1] by ToTensor
        output = cnn(b_x)               # cnn output
        loss = loss_func(output, b_y)   # cross entropy loss
        optimizer.zero_grad()           # clear gradients for this training step
        loss.backward()                 # backpropagation, compute gradients
        optimizer.step()                # apply gradients

        # evaluate on the test subset every 50 steps
        if step % 50 == 0:
            with torch.no_grad():       # no gradients are needed for evaluation
                test_out = cnn(test_x)
            pred_y = torch.max(test_out, 1)[1]
            accuracy = (pred_y == test_y).float().mean()
            print('Epoch:', epoch, '| train loss: %.4f' % loss.item(), '| test accuracy: %.4f' % accuracy.item())
torch.save(cnn, '训练好的model/手写数字识别模型.pkl')  # saves the whole model; the target directory must already exist
# final check on the first 10 test samples
test_output = cnn(test_x[:10])
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()
print(pred_y, 'prediction number')
print(test_y[:10].numpy(), 'real number')
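
The shape comments in the CNN above can be checked by hand. The sketch below uses the standard output-size formula for convolution and pooling layers; the helper name `conv_output_size` is an assumption introduced only for this illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel_size) // stride + 1


shapes = []
size = 28                                                          # MNIST images are 28x28
size = conv_output_size(size, kernel_size=5, stride=1, padding=2)  # conv1 keeps the size
shapes.append(size)
size = conv_output_size(size, kernel_size=2, stride=2)             # 2x2 max pooling halves it
shapes.append(size)
size = conv_output_size(size, kernel_size=5, stride=1, padding=2)  # conv2 keeps the size
shapes.append(size)
size = conv_output_size(size, kernel_size=2, stride=2)             # 2x2 max pooling halves it
shapes.append(size)

flat_features = 32 * size * size   # 32 channels after conv2
print(shapes, flat_features)       # [28, 14, 14, 7] 1568
```

This is why the fully connected layer takes 32 * 7 * 7 inputs.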

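What Is an RNN

A recurrent neural network (RNN) reads a sentence from the beginning and interprets it word by word. Each word it analyzes leaves behind a memory, and the analysis of the next word uses the memory of the previous one, until the last word has been processed.

In the same way, at each step t the RNN produces a state S(t) and hands it to the next step. There, S(t) is combined with the new input to form S(t+1), and the output Y(t+1) is computed from S(t+1), so it depends on both S(t) and S(t+1).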
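
As a rough illustration of the state being handed from step to step, here is a scalar RNN written without any library. The weights `w_x`, `w_s`, `w_y` and the function name `rnn_forward` are hypothetical values chosen for the example; a real RNN uses learned weight matrices.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_s=0.8, w_y=1.0, s0=0.0):
    """Run a scalar RNN over `inputs`, returning the state and output at every step."""
    state = s0
    states, outputs = [], []
    for x in inputs:
        # The new state mixes the current input with the previous state (the memory).
        state = math.tanh(w_x * x + w_s * state)
        states.append(state)
        outputs.append(w_y * state)
    return states, outputs


# Only the first input is non-zero, yet later states are still non-zero:
# the network remembers the first input through its state.
states, outputs = rnn_forward([1.0, 0.0, 0.0])
print(states)
```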
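RNNs can be wired up in many ways. For sentiment classification, for example, only the output of the last step is needed.
For image captioning, a single input (the image) is enough, and the RNN then generates a whole sentence from it.
RNNs can do many entertaining things, such as writing academic papers or program scripts, describing photos, and composing music.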

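LSTM RNN

LSTM (Long Short-Term Memory) is one of the most popular forms of RNN today.

What is wrong with a plain RNN?

After the network has read a sentence from start to end, it computes an error and propagates it backward through every time step. At each step the error is multiplied by a weight W. If W is smaller than 1, the error shrinks at every step and is close to 0 by the time it reaches the early steps; this is the vanishing gradient problem. If W is larger than 1, the error grows at every step instead; this is the exploding gradient problem. In both cases a plain RNN cannot learn from, and so cannot remember, information that lies far back in the sequence.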
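
The effect of repeatedly multiplying the error by W is easy to see numerically. This is a deliberately simplified sketch with a single scalar weight; the function name `backpropagated_error` and the values 0.9 and 1.1 are assumptions picked for illustration.

```python
def backpropagated_error(initial_error, w, steps):
    """Multiply the error by the same weight once per time step."""
    error = initial_error
    for _ in range(steps):
        error *= w
    return error


vanishing = backpropagated_error(1.0, w=0.9, steps=100)   # shrinks toward 0
exploding = backpropagated_error(1.0, w=1.1, steps=100)   # grows very large
print(vanishing, exploding)
```

After 100 steps the first error is smaller than 0.0001 and the second is larger than 10000, even though both weights are close to 1.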

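Compared with a plain RNN, an LSTM adds three controllers (gates): an input gate, a forget gate, and an output gate. Think of the cell state as the main storyline and the current input as a side plot. The input gate decides how much of the side plot is written into the main storyline; if the side plot changes the meaning of the main storyline, the forget gate drops part of the old storyline so that it can be replaced, in proportion, by the new one; and the output gate produces the output by weighing the main storyline against the side plot. The side plot plays the role of short-term memory.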
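
The three gates can be sketched as one step of a scalar LSTM cell. This is a simplified illustration, not how `nn.LSTM` is implemented internally: the function `lstm_step`, the parameter dictionary, and its values are hypothetical, chosen so that the forget gate is almost fully open and the input gate almost fully closed.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; `p` maps gate name -> (w_x, w_h, bias)."""
    def gate(name, squash):
        w_x, w_h, b = p[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new content to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # new content proposed by the current input

    c = f * c_prev + i * g            # cell state: the main storyline
    h = o * math.tanh(c)              # hidden state: the side plot / short-term output
    return h, c


# Hypothetical parameters: forget gate ~1 (keep everything), input gate ~0 (write nothing).
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=5.0, h_prev=0.0, c_prev=0.7, p=params)
print(h, c)
```

With these gate settings the cell state stays at about 0.7 despite the large input, which shows how the main storyline can be carried unchanged across steps.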

Classification with an RNN

import torch
from torch import nn
import torch.utils.data as Data
import torchvision

torch.manual_seed(1)    # reproducible

# Hyper Parameters
EPOCH = 1           # number of passes over the training set; a single pass to save time
BATCH_SIZE = 64
TIME_STEP = 28      # number of rnn time steps / image height
INPUT_SIZE = 28     # rnn input size per step / pixels per image row
LR = 0.01           # learning rate


# MNIST handwritten digits
train_data = torchvision.datasets.MNIST(
    root='./mnist/',    # where the data is stored or loaded from
    train=True,  # this is training data
    transform=torchvision.transforms.ToTensor(),    # converts a PIL.Image or numpy.ndarray to a
                                                    # torch.FloatTensor (C x H x W) scaled to the range [0.0, 1.0]
    download=True,          # download only if the data is not already present
)


test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)

# mini-batches of 64 samples, 1 channel, 28x28: shape (64, 1, 28, 28)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

# to save time, evaluate on the first 2000 test samples only
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000]/255.   # shape from (2000, 28, 28) to (2000, 1, 28, 28), values in [0, 1]
test_y = test_data.targets[:2000]


class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.LSTM(     # LSTM works much better here than a plain nn.RNN()
            input_size=28,      # number of pixels in each image row
            hidden_size=64,     # rnn hidden unit
            num_layers=1,       # number of stacked RNN layers; more layers are usually more powerful but slower to train
            batch_first=True,   # input & output have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )

        self.out = nn.Linear(64, 10)    # output layer

    def forward(self, x):
        # x shape (batch, time_step, input_size)
        # r_out shape (batch, time_step, hidden_size)
        # h_n shape (n_layers, batch, hidden_size)   an LSTM has two hidden states: h_n is the side plot, h_c is the main storyline
        # h_c shape (n_layers, batch, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)   # None means the initial hidden state is all zeros; there is no memory before the first step

        # use r_out at the last time step
        # for this single-layer LSTM, r_out[:, -1, :] equals h_n[-1]
        out = self.out(r_out[:, -1, :])
        return out

rnn = RNN()
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)   # optimize all rnn parameters
loss_func = nn.CrossEntropyLoss()

for epoch in range(EPOCH):
    for step, (x, b_y) in enumerate(train_loader):   # gives batch data
        b_x = x.view(-1, 28, 28)   # reshape x to (batch, time_step, input_size)

        output = rnn(b_x)               # rnn output
        loss = loss_func(output, b_y)   # cross entropy loss
        optimizer.zero_grad()           # clear gradients for this training step
        loss.backward()                 # backpropagation, compute gradients
        optimizer.step()                # apply gradients

# final check on the first 10 test samples
test_output = rnn(test_x[:10].view(-1, 28, 28))
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()
print(pred_y, 'prediction number')
print(test_y[:10], 'real number')

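Regression with an RNN

The classifier above used only the output of the last time step, so it did not exploit everything an RNN offers: an RNN can produce an output at every time step. Here we build a regressor that outputs a value at each time step and compares it with the true value at that step.
We use a sin curve as the input to predict the corresponding cos curve.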

import torch
from torch import nn
import numpy as np
import matplotlib.pyplot as plt

# torch.manual_seed(1)    # reproducible

# Hyper Parameters
TIME_STEP = 10      # rnn time step
INPUT_SIZE = 1      # rnn input size
LR = 0.02           # learning rate

# show data
steps = np.linspace(0, np.pi*2, 100, dtype=np.float32)  # float32 for converting torch FloatTensor
x_np = np.sin(steps)
y_np = np.cos(steps)
plt.plot(steps, y_np, 'r-', label='target (cos)')
plt.plot(steps, x_np, 'b-', label='input (sin)')
plt.legend(loc='best')
plt.show()


class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()

        self.rnn = nn.RNN(
            input_size=INPUT_SIZE,
            hidden_size=32,     # rnn hidden unit
            num_layers=1,       # number of rnn layer
            batch_first=True,   # input & output will have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(32, 1)

    def forward(self, x, h_state):
        # x (batch, time_step, input_size)
        # h_state (n_layers, batch, hidden_size)
        # r_out (batch, time_step, hidden_size)
        r_out, h_state = self.rnn(x, h_state)

        outs = []    # save all predictions
        for time_step in range(r_out.size(1)):    # calculate output for each time step
            outs.append(self.out(r_out[:, time_step, :]))
        return torch.stack(outs, dim=1), h_state

        # instead, for simplicity, you can replace the code above with the following
        # r_out = r_out.view(-1, 32)
        # outs = self.out(r_out)
        # outs = outs.view(-1, TIME_STEP, 1)
        # return outs, h_state

        # or even simpler, since nn.Linear accepts inputs of any dimension
        # and returns outputs with the same dimensions except for the last
        # outs = self.out(r_out)
        # return outs, h_state

rnn = RNN()
print(rnn)

optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)   # optimize all rnn parameters
loss_func = nn.MSELoss()

h_state = None      # for initial hidden state

plt.figure(1, figsize=(12, 5))
plt.ion()           # continuously plot

for step in range(100):
    start, end = step * np.pi, (step+1)*np.pi   # time range
    # use sin predicts cos
    steps = np.linspace(start, end, TIME_STEP, dtype=np.float32, endpoint=False)  # float32 for converting torch FloatTensor
    x_np = np.sin(steps)
    y_np = np.cos(steps)

    x = torch.from_numpy(x_np[np.newaxis, :, np.newaxis])    # shape (batch, time_step, input_size)
    y = torch.from_numpy(y_np[np.newaxis, :, np.newaxis])

    prediction, h_state = rnn(x, h_state)   # rnn output
    # !! next step is important !!
    h_state = h_state.detach()    # detach the hidden state so that backpropagation does not reach into the previous iteration

    loss = loss_func(prediction, y)         # calculate loss
    optimizer.zero_grad()                   # clear gradients for this training step
    loss.backward()                         # backpropagation, compute gradients
    optimizer.step()                        # apply gradients

    # plotting
    plt.plot(steps, y_np.flatten(), 'r-')
    plt.plot(steps, prediction.data.numpy().flatten(), 'b-')
    plt.draw(); plt.pause(0.05)

plt.ioff()
plt.show()

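AutoEncoder: Unsupervised Learning

Roughly speaking, an autoencoder blurs an image down to a coarse code and then restores it.
It compresses a large input (for example a high-resolution image) into a compact representation and decompresses it again when the data is needed for learning.

An autoencoder has to be trained as well. The input X is compressed and then decompressed into a reconstruction of X; the two are compared, and the error is backpropagated to make the autoencoder more accurate. What a trained autoencoder provides is the compressed code in the middle.

In practice, usually only the first half, the Encoder, is used. It extracts the essence of the original data, so a small neural network trained on that essence is enough. This lightens the load on the network and still gives good results.

When extracting the main features, an autoencoder plays the same role as PCA (principal component analysis) and can even outperform it; in other words, an autoencoder can reduce the dimensionality of the features.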
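
To illustrate the link with PCA, here is a minimal linear sketch in which the weights are set by hand instead of being trained. It assumes 2D data that lies along the line y = 2x; the names `direction`, `encode`, `decode`, and `reconstruction_error` are hypothetical. The encoder projects a point onto that line (1 number instead of 2), and the decoder maps the number back to 2D.

```python
import math

# Unit vector along the line y = 2x; for data on that line this is the first principal component.
direction = (1 / math.sqrt(5), 2 / math.sqrt(5))


def encode(point):
    """Compress a 2D point to a single number (its coordinate along `direction`)."""
    return point[0] * direction[0] + point[1] * direction[1]


def decode(code):
    """Decompress the single number back to a 2D point on the line."""
    return (code * direction[0], code * direction[1])


def reconstruction_error(point):
    rx, ry = decode(encode(point))
    return (point[0] - rx) ** 2 + (point[1] - ry) ** 2


on_line = (3.0, 6.0)    # follows the structure of the data
off_line = (3.0, 0.0)   # does not follow it

print(reconstruction_error(on_line))    # essentially 0: nothing is lost
print(reconstruction_error(off_line))   # about 7.2: the code cannot represent this point
```

A trained autoencoder with non-linear layers, like the one in this tutorial, can capture curved structure that a single linear projection of this kind cannot.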

AutoEncoder Implementation

import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
matplotlib.use('TkAgg')  # without this line PyCharm does not show the animated plot
from matplotlib import cm
import numpy as np


# torch.manual_seed(1)    # reproducible

# Hyper Parameters
EPOCH = 10
BATCH_SIZE = 64
LR = 0.005         # learning rate
DOWNLOAD_MNIST = False     # set to True if ./mnist/ does not contain the dataset yet
N_TEST_IMG = 5

# Mnist digits dataset
train_data = torchvision.datasets.MNIST(
    root='./mnist/',
    train=True,                                     # this is training data
    transform=torchvision.transforms.ToTensor(),    # Converts a PIL.Image or numpy.ndarray to
                                                    # torch.FloatTensor of shape (C x H x W) and normalize in the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,                        # download it if you don't have it
)

# plot one example
print(train_data.data.size())       # (60000, 28, 28)
print(train_data.targets.size())    # (60000)
plt.imshow(train_data.data[2].numpy(), cmap='gray')
plt.title('%i' % train_data.targets[2])
plt.show()

# Data Loader for easy mini-batch return in training, the image batch shape will be (64, 1, 28, 28)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)


class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()

        self.encoder = nn.Sequential(
            nn.Linear(28*28, 128),
            nn.Tanh(),
            nn.Linear(128, 64),
            nn.Tanh(),
            nn.Linear(64, 12),
            nn.Tanh(),
            nn.Linear(12, 3),   # compress to 3 features which can be visualized in plt
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.Tanh(),
            nn.Linear(12, 64),
            nn.Tanh(),
            nn.Linear(64, 128),
            nn.Tanh(),
            nn.Linear(128, 28*28),
            nn.Sigmoid(),       # compress to a range (0, 1)
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded


autoencoder = AutoEncoder()

optimizer = torch.optim.Adam(autoencoder.parameters(), lr=LR)
loss_func = nn.MSELoss()

# initialize figure
f, a = plt.subplots(2, N_TEST_IMG, figsize=(5, 2))
plt.ion()   # continuously plot

# original data (first row) for viewing
view_data = train_data.train_data[:N_TEST_IMG].view(-1, 28*28).type(torch.FloatTensor)/255.
for i in range(N_TEST_IMG):
    a[0][i].imshow(np.reshape(view_data.data.numpy()[i], (28, 28)), cmap='gray'); a[0][i].set_xticks(()); a[0][i].set_yticks(())

for epoch in range(EPOCH):
    for step, (x, b_label) in enumerate(train_loader):
        b_x = x.view(-1, 28*28)   # batch x, shape (batch, 28*28)
        b_y = x.view(-1, 28*28)   # batch y, shape (batch, 28*28)

        encoded, decoded = autoencoder(b_x)

        loss = loss_func(decoded, b_y)      # mean square error
        optimizer.zero_grad()               # clear gradients for this training step
        loss.backward()                     # backpropagation, compute gradients
        optimizer.step()                    # apply gradients

        if step % 100 == 0:
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.item())

            # plotting decoded image (second row)
            _, decoded_data = autoencoder(view_data)
            for i in range(N_TEST_IMG):
                a[1][i].clear()
                a[1][i].imshow(np.reshape(decoded_data.data.numpy()[i], (28, 28)), cmap='gray')
                a[1][i].set_xticks(()); a[1][i].set_yticks(())
            plt.draw(); plt.pause(0.05)

plt.ioff()
plt.show()

# visualize in 3D plot
view_data = train_data.data[:200].view(-1, 28*28).type(torch.FloatTensor)/255.
encoded_data, _ = autoencoder(view_data)
fig = plt.figure(2)
ax = fig.add_subplot(projection='3d')   # Axes3D(fig) is no longer attached to the figure automatically in newer matplotlib releases
X, Y, Z = encoded_data.data[:, 0].numpy(), encoded_data.data[:, 1].numpy(), encoded_data.data[:, 2].numpy()
values = train_data.targets[:200].numpy()
for x, y, z, s in zip(X, Y, Z, values):
    c = cm.rainbow(int(255*s/9))    # one color per digit label
    ax.text(x, y, z, str(s), backgroundcolor=c)
ax.set_xlim(X.min(), X.max()); ax.set_ylim(Y.min(), Y.max()); ax.set_zlim(Z.min(), Z.max())
plt.show()

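DQN (Reinforcement Learning)

AlphaGo was built with reinforcement learning methods.

If we stored the value of every action for every possible state in a table, the table would exhaust memory, and searching it would be slow. Instead, we can build a neural network. One option is to feed it a state and an action and have it output the value of that action; another is to feed it only the state and have it output the values of all actions at once, which is the form used in the code below. Like a person, the network goes from an input directly to a decision.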
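
For contrast, this is roughly what the table-based approach looks like. It is a minimal sketch of a tabular Q-learning update; the learning rate `ALPHA`, the state names, and the function `q_update` are assumptions made for the example.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate (hypothetical value)
GAMMA = 0.9   # reward discount, same role as in the DQN code

# One row per state that has been seen, one column per action (2 actions here).
q_table = defaultdict(lambda: [0.0, 0.0])


def q_update(state, action, reward, next_state):
    """Move Q(state, action) toward reward + GAMMA * max over actions of Q(next_state, .)."""
    target = reward + GAMMA * max(q_table[next_state])
    q_table[state][action] += ALPHA * (target - q_table[state][action])


q_update("s0", 1, 1.0, "s1")
print(q_table["s0"][1])   # 0.1
print(len(q_table))       # 2 rows so far: "s0" and "s1"
```

Every distinct state needs its own row. With a continuous observation such as CartPole's four floating-point numbers, almost every step produces a new state, which is why the table is replaced by a network.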
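Two techniques contribute most to making DQN work: experience replay and fixed Q-targets. Experience replay stores past transitions in a memory and trains on random samples drawn from it. Fixed Q-targets compute the training target with a separate target network whose parameters are copied from the trained network only every so often.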
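
Both ideas can be sketched without any deep learning library. The class `ReplayBuffer` and the function `td_targets` below are hypothetical names; the buffer uses a `deque` rather than the NumPy array with a wrapping index used in the implementation that follows, and `frozen_target_q` is a stand-in for the target network.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)   # the oldest transition is dropped when full

    def store(self, s, a, r, s_next):
        self.memory.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def td_targets(batch, target_q, gamma=0.9):
    """Targets r + gamma * max_a Q_target(s', a), computed with the frozen target function."""
    return [r + gamma * max(target_q(s_next)) for (_, _, r, s_next) in batch]


buffer = ReplayBuffer(capacity=3)
for i in range(5):
    buffer.store(i, 0, float(i), i + 1)   # transitions 0 and 1 get overwritten


def frozen_target_q(state):
    # Stand-in for the target network: returns the same action values for every state.
    return [1.0, 2.0]


targets = td_targets(list(buffer.memory), frozen_target_q)
batch = buffer.sample(2)
print(len(buffer), targets, batch)
```

Because the targets come from a function that is held fixed between updates, the network being trained is not chasing a target that moves at every step.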

DQN Implementation

import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import gym

# Hyper Parameters
BATCH_SIZE = 32
LR = 0.01                   # learning rate
EPSILON = 0.9               # greedy policy
GAMMA = 0.9                 # reward discount
TARGET_REPLACE_ITER = 100   # target update frequency
MEMORY_CAPACITY = 2000
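env = gym.make('CartPole-v0')   # written for the older gym API: reset() returns only the observation, step() returns (s_, r, done, info)
env = env.unwrapped
N_ACTIONS = env.action_space.n
N_STATES = env.observation_space.shape[0]
ENV_A_SHAPE = 0 if isinstance(env.action_space.sample(), (int, np.integer)) else env.action_space.sample().shape     # to confirm the shape; NumPy integers count as scalar actions too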


class Net(nn.Module):
    def __init__(self, ):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(N_STATES, 50)
        self.fc1.weight.data.normal_(0, 0.1)   # initialize the weights from a normal distribution, which trains better here
        self.out = nn.Linear(50, N_ACTIONS)
        self.out.weight.data.normal_(0, 0.1)   # initialization

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)
        actions_value = self.out(x)
        return actions_value


class DQN(object):
    def __init__(self):
        self.eval_net, self.target_net = Net(), Net()

        self.learn_step_counter = 0  # number of learning steps taken so far
        self.memory_counter = 0  # number of transitions stored so far
        self.memory = np.zeros((MEMORY_CAPACITY, N_STATES * 2 + 2))  # replay memory, initialized to zeros: one row per transition, each row holds (s, a, r, s_), i.e. N_STATES * 2 + 2 values
        self.optimizer = torch.optim.Adam(self.eval_net.parameters(), lr=LR)
        self.loss_func = nn.MSELoss()

    def choose_action(self, x):  # epsilon-greedy action selection
        x = torch.unsqueeze(torch.FloatTensor(x), 0)  # x is the current observation
        # input only one sample
        if np.random.uniform() < EPSILON:   # greedy: with probability EPSILON (90%) take the action with the highest predicted value
            actions_value = self.eval_net.forward(x)  # predicted value of each action
            action = torch.max(actions_value, 1)[1].data.numpy()  # index of the highest value
            action = action[0] if ENV_A_SHAPE == 0 else action.reshape(ENV_A_SHAPE)  # return the argmax index
        else:   # explore: with probability 1 - EPSILON (10%) pick a random action
            action = np.random.randint(0, N_ACTIONS)
            action = action if ENV_A_SHAPE == 0 else action.reshape(ENV_A_SHAPE)
        return action

    def store_transition(self, s, a, r, s_):  # store one transition in the replay memory; learning samples from this memory
        transition = np.hstack((s, [a, r], s_))
        # replace the old memory with new memory: once the memory is full, the index wraps around and overwrites the oldest rows
        index = self.memory_counter % MEMORY_CAPACITY
        self.memory[index, :] = transition
        self.memory_counter += 1

    def learn(self):  # one learning step
        # target parameter update: copy eval_net into target_net every TARGET_REPLACE_ITER steps
        if self.learn_step_counter % TARGET_REPLACE_ITER == 0:
            self.target_net.load_state_dict(self.eval_net.state_dict())
        self.learn_step_counter += 1

        # target_net is refreshed only occasionally, while eval_net is updated at every step
        # sample a random batch of transitions from the replay memory
        sample_index = np.random.choice(MEMORY_CAPACITY, BATCH_SIZE)
        b_memory = self.memory[sample_index, :]
        b_s = torch.FloatTensor(b_memory[:, :N_STATES])
        b_a = torch.LongTensor(b_memory[:, N_STATES:N_STATES+1].astype(int))
        b_r = torch.FloatTensor(b_memory[:, N_STATES+1:N_STATES+2])
        b_s_ = torch.FloatTensor(b_memory[:, -N_STATES:])

        # q_eval w.r.t the action in experience
        q_eval = self.eval_net(b_s).gather(1, b_a)  # shape (batch, 1): value of the action actually taken (push left or right)
        q_next = self.target_net(b_s_).detach()     # detach from graph, don't backpropagate: target_net is only updated by the copy above
        q_target = b_r + GAMMA * q_next.max(1)[0].view(BATCH_SIZE, 1)   # shape (batch, 1): reward plus the discounted value of the next state; max(1) returns (values, indices), so [0] takes the values
        loss = self.loss_func(q_eval, q_target)  # q_eval is the prediction, q_target is the regression target

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

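dqn = DQN()


# a reinforcement learning training loop
print('\nCollecting experience...')
for i_episode in range(400):
    s = env.reset()  # initial observation from the environment; s stands for state
    ep_r = 0
    while True:
        env.render()  # render the environment
        a = dqn.choose_action(s)  # dqn picks an action for the current state
        s_, r, done, info = env.step(a)  # feedback from the environment for that action

        # modify the reward: the original reward is hard to learn from, so we define our own,
        # which is largest when the cart is centered and the pole is upright
        x, x_dot, theta, theta_dot = s_
        r1 = (env.x_threshold - abs(x)) / env.x_threshold - 0.8
        r2 = (env.theta_threshold_radians - abs(theta)) / env.theta_threshold_radians - 0.5
        r = r1 + r2

        dqn.store_transition(s, a, r, s_)  # store the current state, the action, the reward, and the next state

        ep_r += r
        if dqn.memory_counter > MEMORY_CAPACITY:
            dqn.learn()  # start learning once the memory is full
            if done:
                print('Ep: ', i_episode,
                      '| Ep_r: ', round(ep_r, 2))

        if done:  # the episode is over, move on to the next episode
            break
        s = s_  # the next state becomes the current state for the next step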

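GAN (Generative Adversarial Network)

A GAN is a generative network: it creates things out of nothing, for example producing an artwork from random numbers.

Start with a painter who has inspiration but no idea whether the paintings are any good.
Then add a critic who is not good at judging art either. You, the reader, teach the critic to tell good paintings from bad ones by supplying real masterpieces; the critic learns while telling the painter how to paint. This is the adversarial network.

The generator turns random numbers into meaningful data. The discriminator learns to tell real data from generated data and then passes what it has learned back to the generator through backpropagation, so that the generator can produce data from random numbers that looks more like the real data.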
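
The two objectives can be written down directly from the probabilities that the discriminator outputs. This is a library-free sketch of the standard GAN losses; the function names `discriminator_loss` and `generator_loss` and the example probabilities are assumptions made for illustration.

```python
import math


def discriminator_loss(prob_real, prob_fake):
    """D wants prob_real near 1 and prob_fake near 0 (binary cross entropy)."""
    terms = [math.log(pr) + math.log(1.0 - pf) for pr, pf in zip(prob_real, prob_fake)]
    return -sum(terms) / len(terms)


def generator_loss(prob_fake):
    """G wants D to assign a high probability to its fakes, which lowers log(1 - prob_fake)."""
    return sum(math.log(1.0 - pf) for pf in prob_fake) / len(prob_fake)


confident = discriminator_loss([0.9, 0.9], [0.1, 0.1])   # D separates real from fake well
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])   # D cannot tell them apart
print(confident, undecided)
print(generator_loss([0.1]), generator_loss([0.9]))      # G's loss is lower when D is fooled
```

When the discriminator can no longer tell the two sources apart it outputs 0.5 everywhere, and its loss is 2 * ln 2, about 1.386. This is where the value -1.38 for the negated discriminator loss in the plotting code of the implementation comes from.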

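A natural question: why not let the painter learn from the famous painters directly, instead of going through a critic?
Answer: the GAN as a whole is the learning process. The critic exists to pass gradients to the painter so that the painter can learn; it can be understood as the part of the painter's own learning that supplies the training signal.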

GAN Implementation

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import matplotlib;matplotlib.use('tkagg')

# torch.manual_seed(1)    # reproducible
# np.random.seed(1)

# Hyper Parameters
BATCH_SIZE = 64
LR_G = 0.0001           # learning rate for generator
LR_D = 0.0001           # learning rate for discriminator
N_IDEAS = 5             # think of this as number of ideas for generating an art work (the Generator's random input)
ART_COMPONENTS = 15     # number of points per painting; the 15 points are connected into a curve
PAINT_POINTS = np.vstack([np.linspace(-1, 1, ART_COMPONENTS) for _ in range(BATCH_SIZE)])

# show our beautiful painting range
# plt.plot(PAINT_POINTS[0], 2 * np.power(PAINT_POINTS[0], 2) + 1, c='#74BCFF', lw=3, label='upper bound')
# plt.plot(PAINT_POINTS[0], 1 * np.power(PAINT_POINTS[0], 2) + 0, c='#FF9359', lw=3, label='lower bound')
# plt.legend(loc='upper right')
# plt.show()

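def artist_works():  # a batch of paintings by the famous artist
    a = np.random.uniform(1, 2, size=BATCH_SIZE)[:, np.newaxis]  # one random coefficient in [1, 2) per painting; np.newaxis adds a dimension for broadcasting
    paintings = a * np.power(PAINT_POINTS, 2) + (a-1)  # each painting is the curve a * x^2 + (a - 1)
    paintings = torch.from_numpy(paintings).float()  # convert to a torch FloatTensor
    return paintings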

G = nn.Sequential(                      # Generator
    nn.Linear(N_IDEAS, 128),            # random ideas (could from normal distribution)
    nn.ReLU(),
    nn.Linear(128, ART_COMPONENTS),     # making a painting from these random ideas
)

D = nn.Sequential(                      # Discriminator, accepts paintings from either painter
    nn.Linear(ART_COMPONENTS, 128),     # receive art work either from the famous artist or a newbie like G
    nn.ReLU(),
    nn.Linear(128, 1),                  # a single score used to decide who painted it
    nn.Sigmoid(),                       # tell the probability that the art work is made by artist
)

opt_D = torch.optim.Adam(D.parameters(), lr=LR_D)
opt_G = torch.optim.Adam(G.parameters(), lr=LR_G)

plt.ion()   # something about continuous plotting

for step in range(10000):
    artist_paintings = artist_works()  # real painting from artist
    G_ideas = torch.randn(BATCH_SIZE, N_IDEAS)  # random ideas
    G_paintings = G(G_ideas)                    # fake painting from G (random ideas)

    prob_artist1 = D(G_paintings)               # probability that D assigns to the fakes; G tries to increase it
    G_loss = torch.mean(torch.log(1. - prob_artist1))
    opt_G.zero_grad()
    G_loss.backward()
    opt_G.step()

    # probabilities that D assigns to the two sources of paintings
    prob_artist0 = D(artist_paintings)          # D try to increase this prob
    prob_artist1 = D(G_paintings.detach())  # D try to reduce this prob
    D_loss = - torch.mean(torch.log(prob_artist0) + torch.log(1. - prob_artist1))  # raise the probability for the famous artist and lower it for the generator; the minus sign turns this maximization into a loss to minimize (binary cross entropy)
    opt_D.zero_grad()
    D_loss.backward()      # G_paintings is detached, so the graph does not need to be retained
    opt_D.step()

    if step % 50 == 0:  # plotting
        plt.cla()
        plt.plot(PAINT_POINTS[0], G_paintings.data.numpy()[0], c='#4AD631', lw=3, label='Generated painting',)
        plt.plot(PAINT_POINTS[0], 2 * np.power(PAINT_POINTS[0], 2) + 1, c='#74BCFF', lw=3, label='upper bound')
        plt.plot(PAINT_POINTS[0], 1 * np.power(PAINT_POINTS[0], 2) + 0, c='#FF9359', lw=3, label='lower bound')
        plt.text(-.5, 2.3, 'D accuracy=%.2f (0.5 for D to converge)' % prob_artist0.data.numpy().mean(), fontdict={'size': 13})
        plt.text(-.5, 2, 'D score= %.2f (-1.38 for G to converge)' % -D_loss.data.numpy(), fontdict={'size': 13})
        plt.ylim((0, 3));plt.legend(loc='upper right', fontsize=10);plt.draw();plt.pause(0.01)

plt.ioff()
plt.show()