# Bake your own Deep Dream

This article is a follow-up to a previous article; I suggest you check it out before moving on.

# An article with a purpose

**Deep Learning** is, without a doubt, an interesting and fascinating topic.
It can be applied in many different contexts, from image recognition to text processing;
companies like Google and Facebook heavily rely on Deep Learning to offer many of their services.

However, we are not going to look at how DL can be integrated into business logic; that would be constructive and useful, *ergo* not suited for this blog.

Instead, we are going to take a deeper look at Google's Deep Dream,
aiming to create these hypnotic images ourselves.
Please note that, jokes aside, this **is** indeed useful: it allows us to better understand the processes that take place during the classification procedure.

## Tools

Once again, we are going to use Keras on top of TensorFlow, to keep the code readable and to avoid complications.

We are going to implement our own Deep Dream convnet using the pre-trained weights we have already used last time.

The code, however, will be slightly different and we are not reusing the one we wrote last time.

# The coding bit

First off, we start with a few utilities:

```
from __future__ import print_function
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
from scipy.misc import imsave
from scipy.optimize import fmin_l_bfgs_b
import time
import argparse
from keras.applications import vgg16
from keras import backend as K
from keras.layers import Input

parser = argparse.ArgumentParser(description='Deep Dream implementation with TensorFlow')
parser.add_argument('base_image_path', metavar='base', type=str,
                    help='Path to the image to transform.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
                    help='Prefix for the saved results.')
args = parser.parse_args()
base_image_path = args.base_image_path
result_prefix = args.result_prefix

# maximum number of iterations to perform
N_ITER = 100

# dimensions of the generated picture
img_width = 450
img_height = 900

# path to the model weights file
weights_path = 'vgg16_weights.h5'
```

These utilities include:

- `import` statements
- **Argument** handling: we expect two parameters, the starting image and a prefix to save the output images
- The maximum number of iterations to perform (`N_ITER`)
- **Dimensions** for the generated images
- The **weights** to use for our network (VGG16)

Loading the model is, like last time, pretty straightforward:

```
# this will contain our generated image
dream = Input(batch_shape=(1,) + img_size)

# build the VGG16 network with our placeholder;
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=dream,
                    weights='imagenet', include_top=False)
print('Model loaded.')

# map layer names to layer objects
layer_dict = dict([(layer.name, layer) for layer in model.layers])
```

We create a `placeholder` and we use it as the `input_tensor` for our network.

The `img_size` variable is computed this way:

```
if K.image_dim_ordering() == 'th':
    img_size = (3, img_width, img_height)
else:
    img_size = (img_width, img_height, 3)
```

The `dim_ordering` can be either “tf” or “th”.
It tells Keras whether to use TensorFlow or Theano dimension ordering for inputs/kernels/outputs.
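As a quick illustration (with a toy numpy array rather than a real image), the same image tensor in the two orderings is just a transpose away:

```
import numpy as np

# a toy 2x2 RGB "image" in TensorFlow ordering: (rows, cols, channels)
img_tf = np.arange(2 * 2 * 3).reshape((2, 2, 3))

# Theano ordering puts channels first: (channels, rows, cols)
img_th = img_tf.transpose((2, 0, 1))

print(img_tf.shape)  # (2, 2, 3)
print(img_th.shape)  # (3, 2, 2)
```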

We can now write our loss function:

```
# define the loss
loss = K.variable(0.)
for layer_name in settings['features']:
    # add the L2 norm of the features of a layer to the loss
    assert layer_name in layer_dict.keys(), 'Layer ' + layer_name + ' not found in model.'
    coeff = settings['features'][layer_name]
    x = layer_dict[layer_name].output
    shape = layer_dict[layer_name].output_shape
    # we avoid border artifacts by only involving non-border pixels in the loss
    if K.image_dim_ordering() == 'th':
        loss -= coeff * K.sum(K.square(x[:, :, 2: shape[2] - 2, 2: shape[3] - 2])) / np.prod(shape[1:])
    else:
        loss -= coeff * K.sum(K.square(x[:, 2: shape[1] - 2, 2: shape[2] - 2, :])) / np.prod(shape[1:])
```

The next step is applying a couple of tweaks to achieve better results:

- A **continuity loss**, to give the image local coherence and avoid messy blurs
- The **L2 norm loss**, in order to prevent pixels from taking very high values

Two lines of code will suffice:

```
# add continuity loss
loss += settings['continuity'] * continuity_loss(dream) / np.prod(img_size)
# add image L2 norm to loss
loss += settings['dream_l2'] * K.sum(K.square(dream)) / np.prod(img_size)
```

where `continuity_loss` is a utility function defined by us:

```
# continuity loss util function
def continuity_loss(x):
    assert K.ndim(x) == 4
    if K.image_dim_ordering() == 'th':
        a = K.square(x[:, :, :img_width - 1, :img_height - 1] -
                     x[:, :, 1:, :img_height - 1])
        b = K.square(x[:, :, :img_width - 1, :img_height - 1] -
                     x[:, :, :img_width - 1, 1:])
    else:
        a = K.square(x[:, :img_width - 1, :img_height - 1, :] -
                     x[:, 1:, :img_height - 1, :])
        b = K.square(x[:, :img_width - 1, :img_height - 1, :] -
                     x[:, :img_width - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
```
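The shifted-slice subtractions above compare each pixel with its neighbours below and to the right, so the loss is essentially a total-variation penalty. A minimal numpy sketch of the same idea on a single 2-D channel (a toy stand-in, not the Keras version):

```
import numpy as np

def toy_continuity_loss(x):
    # squared differences with the pixel below (a) and to the right (b)
    a = np.square(x[:-1, :-1] - x[1:, :-1])
    b = np.square(x[:-1, :-1] - x[:-1, 1:])
    return np.sum(np.power(a + b, 1.25))

flat = np.ones((4, 4))                             # perfectly smooth image
ramp = np.arange(16, dtype=float).reshape((4, 4))  # varying image

print(toy_continuity_loss(flat))  # 0.0 -- no variation, no penalty
assert toy_continuity_loss(ramp) > toy_continuity_loss(flat)
```

A smooth image pays no penalty, so adding this term to the loss nudges the optimizer toward locally coherent pixels.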

Our model is now complete. Feel free to further modify the loss as you see fit, to achieve new effects.

Now things get a little trickier:
we need to evaluate our loss and our gradients in one pass, but `scipy.optimize` requires separate functions for loss and gradients, and computing them separately would be inefficient. To solve this we create our own `Evaluator` class:

```
class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
```

The `eval_loss_and_grads` function will then read as follows:

```
# compute the gradients of the dream wrt the loss
grads = K.gradients(loss, dream)

outputs = [loss]
if type(grads) in {list, tuple}:
    outputs += grads
else:
    outputs.append(grads)
f_outputs = K.function([dream], outputs)

def eval_loss_and_grads(x):
    x = x.reshape((1,) + img_size)
    outs = f_outputs([x])
    loss_value = outs[0]
    if len(outs[1:]) == 1:
        grad_values = outs[1].flatten().astype('float64')
    else:
        grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values
```
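To see the caching pattern at work outside the network, here is a self-contained toy sketch: the same `Evaluator` shape driving `fmin_l_bfgs_b`, with a simple quadratic standing in for the network loss (the function and its minimum at 3 are made up for illustration):

```
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def eval_loss_and_grads(x):
    # toy objective: loss = sum((x - 3)^2), gradient = 2 * (x - 3)
    return np.sum((x - 3.0) ** 2), 2.0 * (x - 3.0)

class Evaluator(object):
    """Computes loss and gradients in one pass, caching the gradients."""
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # single expensive evaluation; gradients are cached for grads()
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        # return the cached gradients, then reset for the next step
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
x, min_val, info = fmin_l_bfgs_b(evaluator.loss, np.zeros(5),
                                 fprime=evaluator.grads)
print(np.round(x, 3))  # converges to ~[3. 3. 3. 3. 3.]
```

L-BFGS always calls `loss` before `grads` at each step, which is why caching the gradient inside `loss` saves a full second evaluation.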

All that’s left now is to run the L-BFGS optimizer over the pixels of the generated image, in order to minimize the loss:

```
x = preprocess_image(base_image_path)
for i in range(N_ITER):
    print('Start of iteration', i)
    start_time = time.time()
    # add a random jitter to the initial image; this will be reverted at decoding time
    random_jitter = (settings['jitter'] * 2) * (np.random.random(img_size) - 0.5)
    x += random_jitter
    # run L-BFGS for 7 steps
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=7)
    print('Current loss value:', min_val)
    # decode the dream and save it
    x = x.reshape(img_size)
    x -= random_jitter
    img = deprocess_image(np.copy(x))
    fname = result_prefix + '_at_iteration_%d.png' % i
    imsave(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))
```
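Note that the jitter added before the optimizer step is subtracted again before decoding, so the saved image is built from the un-jittered pixels. A toy numpy check of that round trip (ignoring the optimizer step in between):

```
import numpy as np

img_size = (4, 4, 3)
x = np.random.random(img_size)
original = x.copy()

jitter_amount = 5  # plays the role of settings['jitter']
random_jitter = (jitter_amount * 2) * (np.random.random(img_size) - 0.5)

x = x + random_jitter  # perturb before optimizing
x = x - random_jitter  # revert before decoding
assert np.allclose(x, original)
```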

What are we missing? Just a couple of utility functions and a custom configuration (the `settings` variable) for our network.

Here are the functions `preprocess_image` and `deprocess_image`:

```
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_width, img_height))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg16.preprocess_input(img)
    return img

# util function to convert a tensor into a valid image
def deprocess_image(x):
    if K.image_dim_ordering() == 'th':
        x = x.reshape((3, img_width, img_height))
        x = x.transpose((1, 2, 0))
    else:
        x = x.reshape((img_width, img_height, 3))
    # remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR' -> 'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x
```
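`deprocess_image` simply inverts the VGG16 preprocessing: it adds back the ImageNet mean pixel and flips BGR back to RGB. A round-trip sketch using hand-rolled stand-ins for `vgg16.preprocess_input` and `deprocess_image` (the helper names are made up; the mean values match the ones above, and we round before casting so the round trip is exact for integer pixels):

```
import numpy as np

MEAN_BGR = np.array([103.939, 116.779, 123.68])

def manual_preprocess(img_rgb):
    # RGB -> BGR, then remove the mean pixel (what vgg16.preprocess_input does)
    return img_rgb[:, :, ::-1].astype('float64') - MEAN_BGR

def manual_deprocess(x):
    x = x + MEAN_BGR   # add the mean pixel back
    x = x[:, :, ::-1]  # BGR -> RGB
    return np.clip(np.rint(x), 0, 255).astype('uint8')

img = np.random.randint(0, 256, size=(4, 4, 3)).astype('uint8')
assert np.array_equal(manual_deprocess(manual_preprocess(img)), img)
```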

Next up, a couple of example settings that I like:

```
saved_settings = {
    'acid': {'features': {'block4_conv1': 0.05,
                          'block4_conv2': 0.01,
                          'block4_conv3': 0.01},
             'continuity': 0.1,
             'dream_l2': 0.8,
             'jitter': 5},
    'doggos': {'features': {'block5_conv1': 0.05,
                            'block5_conv2': 0.02},
               'continuity': 0.1,
               'dream_l2': 0.02,
               'jitter': 0},
}

# the settings we will use in this experiment
settings = saved_settings['doggos']
```

The results that I’ll show below are obtained with the `doggos` setting.

# The fun part

To test our model I’ve chosen famous paintings and I’ve run the code above for an “appropriate” number of iterations. I usually prefer images where you can recognize both the original painting and the network’s work.

## Input 1: Nascita di Venere

The *Birth of Venus* is a painting by Sandro Botticelli, generally thought to have been painted in the mid-1480s.
This is an iconic and easily recognizable painting:

Here are the results after various number of iterations:

- After **1** iteration

  The painting is clearly there, almost untouched, but you can already see that weird images are forming.

- After **5** iterations

- After **10** iterations

  Here the painting has been heavily modified, and you can recognize animals that the network was trained to recognize.

- After **20** iterations

- And finally, after **25** iterations

  This is the point where I like it the most, but it’s a matter of personal taste and you can perform as many iterations as you like.

## Input 2: Creazione di Adamo

The *Creation of Adam* is a fresco painting by Michelangelo, which forms part of the Sistine Chapel’s ceiling, painted c. 1508–1512. Yet another very famous painting:

Let’s see how our network will deface this painting:

- After **1** iteration:

- After **20** iterations:

- After **40** iterations:

- After **60** iterations:

## Other tests

I’ve tested the code on many different paintings and photos; I’ll include here just the two that I liked the most:

My bet is you will recognize these paintings.

# Summary

Deep Dream helps us understand and visualize how neural networks are able to carry out difficult classification tasks, improve network architecture, and check what the network has learned during training. It also makes us wonder whether neural networks could become a tool for artists, a new way to remix visual concepts, or perhaps even shed a little light on the roots of the creative process in general.
