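Neural Style opened a path between computing and art: it can render a photo in the style of a famous painter. However, the method takes tens of minutes even on a GPU. Fast Neural Style takes a different approach to building the stylized image, and can stylize a picture in ten-odd seconds on a laptop CPU. Let's look at how it works.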
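The original Neural Style builds an optimization problem on top of VGG. It runs both the content image (the picture to be stylized) and the style image through VGG in a forward pass. From the content image it extracts the relu4 feature maps; from the style image it extracts the relu1, relu2, relu3, relu4 and relu5 feature maps. The goal is to turn an image initialized from random noise into the stylized target: that image is also passed through VGG to obtain its feature maps, and from these a content loss (against the content image's relu4 features) and a style loss (against the style image's relu1 to relu5 features) are computed separately.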
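The two losses described above can be sketched as follows. This is a minimal illustration in plain Python, assuming the usual formulation: mean squared error between feature maps for the content loss, and mean squared error between Gram matrices for the style loss. The function names, the flat-list data layout and the normalization constants are assumptions made for this example, and the per-layer weights are left out.

```python
def content_loss(gen_feat, content_feat):
    """Mean squared difference between two feature maps given as flat lists."""
    n = len(gen_feat)
    return sum((g - c) ** 2 for g, c in zip(gen_feat, content_feat)) / n


def gram_matrix(feat):
    """Gram matrix of one layer's features: C channels, each a flat list of H*W values.

    Entry (i, j) is the dot product of channels i and j divided by C*H*W.
    The normalization is an assumption; implementations differ.
    """
    channels = len(feat)
    size = len(feat[0])
    norm = channels * size
    return [[sum(a * b for a, b in zip(feat[i], feat[j])) / norm
             for j in range(channels)]
            for i in range(channels)]


def style_loss(gen_feats, style_feats):
    """Sum over layers of the mean squared difference between Gram matrices."""
    total = 0.0
    for gen, style in zip(gen_feats, style_feats):
        g_gen, g_style = gram_matrix(gen), gram_matrix(style)
        c = len(g_gen)
        total += sum((g_gen[i][j] - g_style[i][j]) ** 2
                     for i in range(c) for j in range(c)) / (c * c)
    return total
```

The Gram matrix discards where a feature occurs and keeps only which channels fire together, which is why it captures style rather than layout.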
import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.image.resize_images)


def resize_conv2d(x, input_filters, output_filters, kernel, strides, training):
    '''
    An alternative to transposed convolution where we first resize, then convolve.
    See http://distill.pub/2016/deconv-checkerboard/

    For some reason the shape needs to be statically known for gradient propagation
    through tf.image.resize_images, but we only know that for fixed image size, so we
    plumb through a "training" argument
    '''
    with tf.variable_scope('conv_transpose') as scope:
        # Static shape while training (fixed image size), dynamic shape otherwise.
        height = x.get_shape()[1].value if training else tf.shape(x)[1]
        width = x.get_shape()[2].value if training else tf.shape(x)[2]

        # Enlarge by strides * 2; the strided conv2d below then shrinks by strides.
        new_height = height * strides * 2
        new_width = width * strides * 2

        x_resized = tf.image.resize_images(x, [new_height, new_width], tf.image.ResizeMethod.NEAREST_NEIGHBOR)

        # conv2d is a helper that is called here but not shown in this excerpt.
        return conv2d(x_resized, input_filters, output_filters, kernel, strides)


def instance_norm(x):
    # Normalize each channel of each image over its spatial dimensions (H, W).
    epsilon = 1e-9
    mean, var = tf.nn.moments(x, [1, 2], keep_dims=True)
    return tf.div(tf.subtract(x, mean), tf.sqrt(tf.add(var, epsilon)))
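To make the arithmetic of instance_norm concrete, here is a small sketch in plain Python that normalizes one channel of one image at a time. The function names and the flat-list layout are assumptions made for this illustration; the variance is the population variance, which is what tf.nn.moments computes.

```python
import math


def instance_norm_channel(values, epsilon=1e-9):
    """Normalize one channel of one image (a flat list of H*W activations)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # population variance
    return [(v - mean) / math.sqrt(var + epsilon) for v in values]


def instance_norm_image(channels, epsilon=1e-9):
    """Apply the normalization independently to every channel of one image."""
    return [instance_norm_channel(c, epsilon) for c in channels]
```

Because each image is normalized with its own statistics, scaling the contrast of the input (for example multiplying a channel by 10) leaves the normalized output essentially unchanged.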
# Excerpt from the body of the generator network. `image` is the input batch and
# `training` is the flag passed on to resize_conv2d; conv2d and residual are
# helpers that are called here but not shown.
with tf.variable_scope('conv1'):
    conv1 = tf.nn.relu(instance_norm(conv2d(image, 3, 32, 9, 1)))
with tf.variable_scope('conv2'):
    conv2 = tf.nn.relu(instance_norm(conv2d(conv1, 32, 64, 3, 2)))
with tf.variable_scope('conv3'):
    conv3 = tf.nn.relu(instance_norm(conv2d(conv2, 64, 128, 3, 2)))
with tf.variable_scope('res1'):
    res1 = residual(conv3, 128, 3, 1)
with tf.variable_scope('res2'):
    res2 = residual(res1, 128, 3, 1)
with tf.variable_scope('res3'):
    res3 = residual(res2, 128, 3, 1)
with tf.variable_scope('res4'):
    res4 = residual(res3, 128, 3, 1)
with tf.variable_scope('res5'):
    res5 = residual(res4, 128, 3, 1)
with tf.variable_scope('deconv1'):
    deconv1 = tf.nn.relu(instance_norm(resize_conv2d(res5, 128, 64, 3, 2, training)))
with tf.variable_scope('deconv2'):
    deconv2 = tf.nn.relu(instance_norm(resize_conv2d(deconv1, 64, 32, 3, 2, training)))
with tf.variable_scope('deconv3'):
    deconv3 = tf.nn.tanh(instance_norm(conv2d(deconv2, 32, 3, 9, 1)))

# Map the tanh output from [-1, 1] to pixel values in [0, 255].
y = (deconv3 + 1) * 127.5
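The layer list above can be traced by hand to see how the spatial size and channel count change. The sketch below does this in plain Python under two stated assumptions: that conv2d pads so the output size is the input size divided by the stride (rounded up), and that each residual block keeps the spatial size. The helper names are made up for this illustration.

```python
import math


def conv_out(size, stride):
    """Spatial size after a convolution, assuming padding that keeps ceil(size / stride)."""
    return math.ceil(size / stride)


def resize_conv_out(size, stride):
    """resize_conv2d: enlarge by stride * 2, then convolve with the same stride."""
    return conv_out(size * stride * 2, stride)


def trace(size):
    """Return (layer name, spatial size, channels) for a square input of the given size."""
    shapes = []
    for name, channels, stride in [("conv1", 32, 1), ("conv2", 64, 2), ("conv3", 128, 2)]:
        size = conv_out(size, stride)
        shapes.append((name, size, channels))
    for i in range(1, 6):
        # Assumed: each residual block (stride 1) keeps the spatial size.
        shapes.append(("res%d" % i, size, 128))
    for name, channels in [("deconv1", 64), ("deconv2", 32)]:
        size = resize_conv_out(size, 2)
        shapes.append((name, size, channels))
    shapes.append(("deconv3", conv_out(size, 1), 3))
    return shapes


def to_pixel(t):
    """Map a tanh output in [-1, 1] to a pixel value in [0, 255]."""
    return (t + 1) * 127.5
```

Under these assumptions a 256x256 input is reduced to 64x64 by conv3, stays at 64x64 through the five residual blocks, and is brought back to 256x256 with 3 channels by the last layer, so the output has the same size as the input.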
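The last stages of the network upsample the feature maps back to image size, which is standard equipment for a generative model. Note that although the scopes are named deconv and conv_transpose, the code shown here does not call tf.nn.conv2d_transpose: resize_conv2d first enlarges the feature map with nearest-neighbor resizing and then applies an ordinary convolution. In the full model, the deep residual network keeps generating a stylized image from the original picture, and VGG keeps feeding back what is wrong with the result, so the generator is optimized step by step until it produces properly stylized images. When the model is finally put to use, the VGG network behind it is not needed at all; only the deep residual generator in front is kept, much like keeping only the generator of a GAN. (Unlike a GAN discriminator, though, VGG here is a fixed pretrained network that is only used to compute the loss.)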
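The resize step inside resize_conv2d is simple enough to show directly. Below is a minimal sketch in plain Python of nearest-neighbor upsampling by an integer factor on a single 2-D channel; the function name and list-of-rows layout are assumptions made for this illustration, and it is meant to mirror what tf.image.resize_images does with ResizeMethod.NEAREST_NEIGHBOR only for the integer-factor case.

```python
def upsample_nearest(image, factor):
    """Nearest-neighbor upsampling of a 2-D grid (list of rows) by an integer factor."""
    out = []
    for row in image:
        # Repeat every value `factor` times along the width...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times along the height.
        out.extend(list(wide) for _ in range(factor))
    return out
```

Every output pixel is a plain copy of one input pixel, so the resize itself introduces no uneven overlap; the convolution that follows then smooths the blocky result. This is the motivation given in the distill.pub article linked in the resize_conv2d docstring.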
Advantages of Fast Neural Style:
Disadvantages:
Because training is very slow, I simply used the pretrained model published by hzy46. The image after stylization:
INFO:tensorflow:Elapsed time: 1.455744s
As you can see, on a base-model 2015 MacBook Pro it took only about 1.5 seconds to stylize this 252x252 image.
https://github.com/artzers/MachineLearning/tree/master/Tensorflow/fast-neural-style
[Machine Learning] TensorFlow: Understanding and Implementing Fast Neural Style
Original post: http://blog.csdn.net/lpsl1882/article/details/56666265