Exploring Deep Dream and Neural Style Transfer | Generative AI

Written by: Aionlinecourse | Generative AI Tutorials

Introduction

Deep Dream and Neural Style Transfer are two AI techniques that manipulate images for creative purposes. Deep Dream uses a neural network to amplify the patterns it detects in an image, producing unique, dreamlike results. Neural Style Transfer, on the other hand, applies an artistic style to an existing image, making it possible to recreate artworks or produce distinctive visual styles.

Importance of Deep Dream and Neural Style Transfer

Deep Dream and Neural Style Transfer are significant developments in generative AI. They give artists, scientists, and technologists powerful new creative tools. Deep Dream reveals the patterns a neural network has learned to detect, while Neural Style Transfer applies artistic styles to images, opening up new creative possibilities. These methods can be used in art, education, scientific visualization, and business, but they also raise ethical questions.

Let's dive into Deep Dream and Neural Style Transfer

DeepDream is a generative AI method created by engineers at Google. By enhancing and amplifying the patterns a neural network detects, it turns ordinary images into dreamlike, surreal scenes. The technique offers a glimpse of how a neural network perceives the world, and it has inspired artists to push the limits of machine-generated art.

Flowchart of DeepDream:

DeepDream utilizes TensorFlow and the Inception model to enhance patterns in images. By iteratively updating the image based on gradients derived from the network, DeepDream amplifies these patterns until the desired results are achieved. Techniques like blurring gradients and tiling enable the algorithm to handle high-resolution images efficiently.

[Figure: flowchart of DeepDream]
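
The iterative update described above can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the tutorial's implementation: `toy_score_and_gradient` is a hypothetical stand-in for the network, scoring an image by its mean response to a fixed random pattern so that the gradient is known in closed form.

```python
import numpy as np

def toy_score_and_gradient(image, pattern):
    """Hypothetical stand-in for a network layer (an assumption for illustration)."""
    score = float(np.mean(image * pattern))
    gradient = pattern / image.size  # derivative of the mean response w.r.t. each pixel
    return score, gradient

def gradient_ascent(image, pattern, iter_n=20, step=1.0):
    """Repeatedly nudge the image in the direction that raises the score."""
    image = image.copy()
    scores = []
    for _ in range(iter_n):
        score, grad = toy_score_and_gradient(image, pattern)
        grad = grad / (grad.std() + 1e-8)  # normalize so the step size is comparable
        image += grad * step
        scores.append(score)
    return image, scores

rng = np.random.default_rng(0)
pattern = rng.standard_normal((8, 8, 3)).astype(np.float32)
start = rng.uniform(size=(8, 8, 3)).astype(np.float32)
result, scores = gradient_ascent(start, pattern)
print("first score:", scores[0], "last score:", scores[-1])
```

Because each step moves the image along the gradient, the score rises at every iteration, which is the same mechanism DeepDream uses to amplify whatever a chosen layer responds to.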

Implementation of Deep Dream

Let's walk through a simple implementation to understand this better:

Step 1: Importing the Necessary Libraries

import os
from io import BytesIO
import numpy as np
import PIL.Image
from IPython.display import clear_output, Image, display, HTML
import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt
import ipywidgets as widgets
from functools import partial

Step 2: Downloading and Unzipping the Model File

!wget https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip && unzip inception5h.zip

TensorFlow Inception Model Initialization

The code section sets up the Inception model in TensorFlow, parses the model file, creates a graph definition, defines input data placeholders, and imports the graph definition.

model = 'tensorflow_inception_graph.pb'
graph = tf.Graph()
tf_session = tf.compat.v1.InteractiveSession(graph=graph)
with tf.io.gfile.GFile(model, 'rb') as f:
    graph_definition = tf.compat.v1.GraphDef()
    graph_definition.ParseFromString(f.read())
tensor_input = tf.compat.v1.placeholder(np.float32, name='input')
imagenet_mean = 117.0  # mean pixel value used with this Inception graph in the TensorFlow DeepDream example
tensor_preprocessed = tf.expand_dims(tensor_input - imagenet_mean, 0)
tf.compat.v1.import_graph_def(graph_definition, {'input': tensor_preprocessed})
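
To make the preprocessing step concrete, here is a small NumPy sketch of what the placeholder arithmetic does to an image's values and shape. The `preprocess` helper and the `mean_value` used here are illustrative assumptions, not part of the tutorial's code.

```python
import numpy as np

def preprocess(image, mean_value):
    """Subtract a scalar mean, then add a leading batch dimension."""
    return np.expand_dims(image - mean_value, 0)

# A flat grey test image; mean_value=100.0 is chosen only for easy arithmetic.
image = np.full((224, 224, 3), 120.0, dtype=np.float32)
batch = preprocess(image, mean_value=100.0)
print(batch.shape)
```

The network expects a batch of images, so a single (height, width, 3) array becomes (1, height, width, 3) after the mean is subtracted.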

Extracting Layer and Feature Channel Information from TensorFlow Inception Model

The code iterates through the TensorFlow graph operations, selects the Conv2D operations whose names contain the 'import/' prefix, collects their names and output channel counts, and prints the total number of layers and feature channels.

layers = []
feature_numbers = []
for op in graph.get_operations():
   if op.type == 'Conv2D' and 'import/' in op.name:
       layers.append(op.name)
       feature_numbers.append(int(graph.get_tensor_by_name(op.name + ':0').shape[-1]))
print('Total number of layers:', str(len(layers))+"\n")
print('Total number of feature channels:', sum(feature_numbers))
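
The filtering logic can be tried on its own with a mock operation list. In this sketch the `Op` record and the operation names and channel counts are made up for illustration; the real values come from the imported graph.

```python
from collections import namedtuple

# Hypothetical stand-in for a graph operation (assumption for illustration).
Op = namedtuple("Op", ["name", "type", "out_channels"])

ops = [
    Op("input", "Placeholder", 3),
    Op("import/layer_a_pre_relu/conv", "Conv2D", 64),
    Op("import/layer_a", "Relu", 64),
    Op("import/layer_b_pre_relu/conv", "Conv2D", 192),
]

# Keep only convolution ops that belong to the imported model.
conv_ops = [op for op in ops if op.type == "Conv2D" and "import/" in op.name]
layers = [op.name for op in conv_ops]
feature_numbers = [op.out_channels for op in conv_ops]

print("Total number of layers:", len(layers))
print("Total number of feature channels:", sum(feature_numbers))
```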

Extracting Layer Name from Graph Operation

This code section extracts the layer name from a graph operation and processes it to obtain a cleaned layer name.

layer = layers[46]
print(layer + "\n")
layer = "/".join(layer.split("/")[1:-1])
print(layer)
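
As a quick check of the string handling, the sketch below applies the same split-and-join to one example operation name. The `clean_layer_name` helper is introduced here only for illustration, and the example name is assumed from the layer used later in this tutorial.

```python
def clean_layer_name(op_name):
    """Drop the leading 'import' scope and the trailing op suffix."""
    return "/".join(op_name.split("/")[1:-1])

example = "import/mixed4d_3x3_bottleneck_pre_relu/conv"
cleaned = clean_layer_name(example)
print(cleaned)  # mixed4d_3x3_bottleneck_pre_relu
```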

Extracting Tensor from a Layer in the Inception Model

This code section retrieves a specific tensor (layer) from the pre-trained Inception model.

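def Tensor(layer_name):
    # Look up a layer's output tensor by name in the imported graph
    return graph.get_tensor_by_name("import/%s:0" % layer_name)
Tensor(layer)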

Initialization of Layer Name, Channel, and Noisy Image

The provided variables specify a target layer and channel within a neural network for the DeepDream algorithm, applied to a randomly generated noisy image.

layer_name = 'mixed4d_3x3_bottleneck_pre_relu'
channel = 139
noisy_image = np.random.uniform(size=(224,224,3)) + 100.0

Step 3: Naive DeepDream Image Generation

Image Visualization and Normalization Functions

This code section includes two functions for displaying images and normalizing image ranges for visualization purposes.

def show_image(image_array, fmt='jpeg'):
    image_array = np.clip(image_array, 0, 1)
    image_array = (image_array * 255).astype(np.uint8)
    img = PIL.Image.fromarray(image_array)
    f = BytesIO()
    img.save(f, format=fmt)
    display(Image(data=f.getvalue()))
def visual_normalization(image_array, scaling_factor=0.1):
    image_array_mean = np.mean(image_array)
    image_array_std = np.std(image_array)
    max_std = max(image_array_std, 1e-4)
    normalized = (image_array - image_array_mean) / max_std * scaling_factor + 0.5
    return normalized
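
To see what the normalization does numerically, the following sketch repeats the same arithmetic in plain NumPy on a random array. The input values are arbitrary and chosen only for illustration.

```python
import numpy as np

def visual_normalization(image_array, scaling_factor=0.1):
    """Shift to mean 0.5 and scale to a standard deviation of scaling_factor."""
    mean = np.mean(image_array)
    std = max(np.std(image_array), 1e-4)  # avoid dividing by ~0 for flat images
    return (image_array - mean) / std * scaling_factor + 0.5

rng = np.random.default_rng(0)
raw = rng.uniform(size=(32, 32, 3)) * 40.0 + 100.0  # values far outside [0, 1]
normalized = visual_normalization(raw)
print(float(normalized.mean()), float(normalized.std()))
```

With a mean of 0.5 and a standard deviation of 0.1, most values fall inside the [0, 1] range that show_image clips to, so the picture stays visible however large the raw values are.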

Render Naive DeepDream Image

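This code section defines a function render_naive that generates a DeepDream image with plain gradient ascent: at each iteration it computes the gradient of the target score with respect to the input image, normalizes it, and adds it to the image.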

def render_naive(target_tensor, input_image=noisy_image, iter_n=20, step=1.0):
    t_score = tf.reduce_mean(target_tensor)
    t_grad = tf.gradients(t_score, tensor_input)[0]
    input_image = input_image.copy()
    show_image(visual_normalization(input_image))
    for i in range(iter_n):
        gradients, score = tf_session.run([t_grad, t_score], {tensor_input: input_image})
        gradients /= gradients.std() + 1e-8
        input_image += gradients * step
        clear_output()
        show_image(visual_normalization(input_image))
render_naive(Tensor(layer_name)[:,:,:,channel])

Output:

[Figure: naive DeepDream render]

Step 4: Multi-scale DeepDream Image Generation

TensorFlow Function Wrapper with Session Execution

The code section introduces tensorflow_function, a function wrapper that takes care of graph generation, placeholder creation, and session initialization automatically, thereby streamlining the creation and execution of TensorFlow functions.

def tensorflow_function(*argtypes):
    def wrap(f):
        graph = tf.Graph()
        with graph.as_default():
            placeholders = [tf.placeholder(argtype) for argtype in argtypes]
            outputs = f(*placeholders)
            session = tf.Session()
            session.run(tf.global_variables_initializer())
        def wrapper(*args, **kwargs):
            feed_dict = {placeholder: arg for placeholder, arg in zip(placeholders, args)}
            return session.run(outputs, feed_dict=feed_dict)
        return wrapper
    return wrap

Image Resize Function using TensorFlow

The code section outlines a function called resize that uses TensorFlow to resize an input image tensor, expanding it, and adjusting its shape.

def resize(image, size):
    image = tf.expand_dims(image, 0)
    image.set_shape([1, None, None, None])
    return tf.image.resize(image, size, method=tf.image.ResizeMethod.BILINEAR)[0,:,:,:]
resize = tensorflow_function(np.float32, np.int32)(resize)

Tiled Gradient Calculation for Image with Target Gradient

The code calculates the gradient of an image using a tiled approach, dividing the image into tiles, computing each tile's gradient using TensorFlow, and combining them.

def calculate_gradient_tiled(image, target_gradient, tile_size=512):
    sz = tile_size
    h, w = image.shape[:2]
    sx, sy = np.random.randint(sz, size=2)
    img_shift = np.roll(np.roll(image, sx, axis=1), sy, axis=0)
    gradient = np.zeros_like(image)
    for y in range(0, max(h - sz // 2, sz), sz):
        for x in range(0, max(w - sz // 2, sz), sz):
            sub = img_shift[y:y + sz, x:x + sz]
            g = tf_session.run(target_gradient, {tensor_input: sub})
            norm_factor = np.sqrt(np.mean(np.square(g))) + 1e-8
            g /= norm_factor
            gradient[y:y + sz, x:x + sz] = g
    gradient = np.roll(np.roll(gradient, -sx, axis=1), -sy, axis=0)
    return gradient
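
The shift, tile, and unshift mechanics can be checked in isolation. This is a sketch under stated assumptions: `tiled_apply` and its `grad_fn` argument are hypothetical, and a simple per-pixel function replaces the TensorFlow gradient so the result can be verified exactly. The per-tile normalization from the function above is left out.

```python
import numpy as np

def tiled_apply(image, grad_fn, tile_size=4, rng=None):
    """Apply grad_fn tile by tile on a randomly shifted copy of the image."""
    rng = rng or np.random.default_rng()
    sz = tile_size
    h, w = image.shape[:2]
    sx, sy = rng.integers(sz, size=2)  # random shift hides tile seams over iterations
    shifted = np.roll(np.roll(image, sx, axis=1), sy, axis=0)
    out = np.zeros_like(image)
    for y in range(0, max(h - sz // 2, sz), sz):
        for x in range(0, max(w - sz // 2, sz), sz):
            tile = shifted[y:y + sz, x:x + sz]
            out[y:y + sz, x:x + sz] = grad_fn(tile)
    # Undo the shift so the result lines up with the original image.
    return np.roll(np.roll(out, -sx, axis=1), -sy, axis=0)

image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
result = tiled_apply(image, lambda tile: 2.0 * tile, rng=np.random.default_rng(0))
print(np.allclose(result, 2.0 * image))
```

Because the stand-in function works pixel by pixel, the tiled result matches applying it to the whole image at once, which confirms that rolling back restores the original alignment.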

Multi-scale DeepDream Rendering

The render_multiscale function enhances DeepDream rendering by applying gradient ascent to a specified target layer and channel, scaling the image in multiple octaves.

def render_multiscale(target_tensor, input_image=noisy_image, iter_n=10, step=1.0, octave_n=3, octave_scale=1.4):
    t_score = tf.reduce_mean(target_tensor)
    t_grad = tf.gradients(t_score, tensor_input)[0]
    input_image = input_image.copy()
    for octave in range(octave_n):
        if octave > 0:
            hw = np.float32(input_image.shape[:2]) * octave_scale
            input_image = resize(input_image, np.int32(hw))
        for i in range(iter_n):
            gradients = calculate_gradient_tiled(input_image, t_grad)
            gradients /= (np.std(gradients) + 1e-8)
            input_image += gradients * step
            print('.', end=' ')
            clear_output()
            show_image(visual_normalization(input_image))
render_multiscale(Tensor(layer_name)[:,:,:,channel])

Output:

[Figure: multi-scale DeepDream render]
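
To see how the image grows from octave to octave, the sketch below reproduces only the size arithmetic from render_multiscale. The `octave_sizes` helper is introduced here for illustration and is not part of the tutorial's code.

```python
import numpy as np

def octave_sizes(start_hw, octave_n=3, octave_scale=1.4):
    """Return the (height, width) used at each octave."""
    sizes = [tuple(int(v) for v in start_hw)]
    hw = np.float32(start_hw)
    for _ in range(1, octave_n):
        # Scale up, then truncate to integers as np.int32 does in the render loop.
        hw = np.float32(np.int32(hw * octave_scale))
        sizes.append(tuple(int(v) for v in hw))
    return sizes

sizes = octave_sizes((224, 224))
print(sizes)  # [(224, 224), (313, 313), (438, 438)]
```

Starting from the 224x224 noise image, the default three octaves end at 438x438, which is why the tiled gradient computation is useful for the later, larger scales.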

Step 5: Laplacian Pyramid Gradient Normalized Image Generation

Image Splitting into Low and High Frequency Components

The code section outlines a function called laplacian_splitting, which splits an image into low and high frequencies using a convolution operation with a specific kernel.

def laplacian_splitting(image):
    kernel = np.float32([1, 4, 6, 4, 1])
    kernel = np.outer(kernel, kernel)
    kernel = kernel[:, :, None, None] / np.sum(kernel) * np.eye(3, dtype=np.float32)
    with tf.name_scope('split'):
        lo = tf.nn.conv2d(image, kernel, [1, 2, 2, 1], 'SAME')
        lo2 = tf.nn.conv2d_transpose(lo, kernel * 4, tf.shape(image), [1, 2, 2, 1])
        hi = image - lo2
    return lo, hi

Laplacian Pyramid Construction with N Splits

The code section defines a function called laplacian_splitting_n, which creates a Laplacian pyramid with n splits, a multi-scale representation of an image.

def laplacian_splitting_n(image, n):
    levels = []
    for _ in range(n):
        image, hi = laplacian_splitting(image)
        levels.append(hi)
    levels.append(image)
    return levels[::-1]

Laplacian Pyramid Merge

The laplacian_merge function reverses the split: starting from the lowest-frequency level, it repeatedly upsamples the image and adds back the next high-frequency level.

def laplacian_merge(levels):
    image = levels[0]
    kernel = np.float32([1, 4, 6, 4, 1])
    kernel = np.outer(kernel, kernel)
    kernel = kernel[:, :, None, None] / np.sum(kernel) * np.eye(3, dtype=np.float32)
    for hi in levels[1:]:
        with tf.name_scope('merge'):
            image = tf.nn.conv2d_transpose(image, kernel * 4, tf.shape(hi), [1, 2, 2, 1]) + hi
    return image
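
The split and merge steps can be sketched in plain NumPy to show that they are inverses of each other. Note the simplification: this sketch uses 2x2 average pooling and pixel repetition instead of the 5x5 binomial kernel and transposed convolution above, and the helper names are illustrative only.

```python
import numpy as np

def upsample(image):
    """Double height and width by repeating pixels."""
    return image.repeat(2, axis=0).repeat(2, axis=1)

def split(image):
    """Split into a half-size low-frequency image and a full-size detail image."""
    h, w, c = image.shape
    lo = image.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
    hi = image - upsample(lo)
    return lo, hi

def split_n(image, n):
    """Build a pyramid with n detail levels, coarsest level first."""
    levels = []
    for _ in range(n):
        image, hi = split(image)
        levels.append(hi)
    levels.append(image)
    return levels[::-1]

def merge(levels):
    """Rebuild the image by upsampling and adding detail back, level by level."""
    image = levels[0]
    for hi in levels[1:]:
        image = upsample(image) + hi
    return image

rng = np.random.default_rng(0)
original = rng.uniform(size=(16, 16, 3))
levels = split_n(original, 3)
rebuilt = merge(levels)
print([level.shape for level in levels])
print(np.allclose(rebuilt, original))
```

Merging the untouched levels reproduces the original image; Laplacian normalization changes the result only because each level is rescaled before the merge.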

Image Standardization Function

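The normalize_std function rescales an image so that its standard deviation is approximately 1.0. It takes an image tensor as input and divides it by the root mean square of its values, which equals the standard deviation when the values have zero mean; eps guards against division by zero.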

def normalize_std(image, eps=1e-10):
    with tf.name_scope('normalize'):
        std = tf.sqrt(tf.reduce_mean(tf.square(image)))
        return image / tf.maximum(std, eps)
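
The function above divides by the root mean square (RMS) of the values rather than subtracting the mean first. The NumPy sketch below, with arbitrary example numbers and an illustrative helper name, shows the difference this makes when the mean is not zero.

```python
import numpy as np

def normalize_rms(values, eps=1e-10):
    """Divide by the root mean square, mirroring normalize_std."""
    rms = np.sqrt(np.mean(np.square(values)))
    return values / max(rms, eps)

values = np.array([3.0, 4.0, 5.0, 4.0])  # mean is 4, so RMS and std differ
out = normalize_rms(values)
rms_after = float(np.sqrt(np.mean(np.square(out))))
std_after = float(np.std(out))
print(rms_after, std_after)
```

The RMS of the output is 1, while its standard deviation is smaller than 1 because the mean is non-zero. For the high-frequency pyramid levels, whose values are centered near zero, the two measures are close.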

Laplacian Pyramid Normalization

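The laplacian_normalize function splits an image into a Laplacian pyramid, normalizes each level, and merges the levels back into a single image. The final two lines open a separate graph and define a placeholder tensor for Laplacian input; they are not used by the rendering code that follows.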

def laplacian_normalize(image, scale_n=4):
    image = tf.expand_dims(image, 0)
    tlevels = laplacian_splitting_n(image, scale_n)
    tlevels = list(map(normalize_std, tlevels))
    out = laplacian_merge(tlevels)
    return out[0, :, :, :]
with tf.Graph().as_default():
    laplacian_input = tf.placeholder(np.float32, name='laplacian_input')

Image Rendering with Laplacian Normalization

This code section defines a function render_laplacian_normalization that renders an image using Laplacian-normalized gradients during optimization.

def render_laplacian_normalization(target_tensor, input_image= noisy_image, visual_normalization=visual_normalization,
                   iter_n=10, step=1.0, octave_n=4, octave_scale=1.4, lap_n=4):
    t_score = tf.reduce_mean(target_tensor)
    t_grad = tf.gradients(t_score, tensor_input)[0]
    lap_norm_func = tensorflow_function(np.float32)(partial(laplacian_normalize, scale_n=lap_n))
    input_image = input_image.copy()
    for octave in range(octave_n):
        if octave>0:
            hw = np.float32(input_image.shape[:2])*octave_scale
            input_image = resize(input_image, np.int32(hw))
        for i in range(iter_n):
            g = calculate_gradient_tiled(input_image, t_grad)
            g = lap_norm_func(g)
            input_image += g*step
            print('.', end = ' ')
        clear_output()
        show_image(visual_normalization(input_image))

Playing with feature visualizations

render_laplacian_normalization(Tensor(layer_name)[:,:,:,channel])

Output:

[Figure: feature visualization with Laplacian normalization]

Step 6: DeepDream Algorithm Implementation

This code section implements the full DeepDream algorithm. It splits the input image into octaves, then runs gradient ascent at each scale to maximize the target activations, adding the stored detail back in as the image is upscaled.

def deepdream(target_tensor, image= noisy_image,
                     iter_n=8, step=2.0, octave_n=7, octave_scale=1.15):
    t_score = tf.reduce_mean(target_tensor)
    t_grad = tf.gradients(t_score, tensor_input)[0]
    octaves = []
    for i in range(octave_n-1):
        hw = image.shape[:2]
        lo = resize(image, np.int32(np.float32(hw)/octave_scale))
        hi = image-resize(lo, hw)
        image = lo
        octaves.append(hi)
    for octave in range(octave_n):
        if octave>0:
            hi = octaves[-octave]
            image = resize(image, hi.shape[:2])+hi
        for i in range(iter_n):
            g = calculate_gradient_tiled(image, t_grad)
            image += g*(step / (np.abs(g).mean()+1e-7))
            print('.',end = ' ')
            #clear_output()
            show_image(image/255.0)
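
The octave bookkeeping in deepdream can be examined separately from the gradient steps. In this sketch `resize_nearest` is a simple nearest-neighbour stand-in for the bilinear resize, and `split_octaves` and `rebuild` are illustrative names; with no gradient updates applied, rebuilding should return the original image.

```python
import numpy as np

def resize_nearest(image, size):
    """Nearest-neighbour resize (an assumption; the tutorial uses bilinear)."""
    h, w = image.shape[:2]
    new_h, new_w = int(size[0]), int(size[1])
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows][:, cols]

def split_octaves(image, octave_n=3, octave_scale=1.15):
    """Shrink the image step by step, storing the detail lost at each step."""
    octaves = []
    for _ in range(octave_n - 1):
        hw = image.shape[:2]
        lo = resize_nearest(image, np.int32(np.float32(hw) / octave_scale))
        hi = image - resize_nearest(lo, hw)
        image = lo
        octaves.append(hi)
    return image, octaves

def rebuild(image, octaves):
    """Upscale from the smallest octave, adding stored detail back each time."""
    for octave in range(1, len(octaves) + 1):
        hi = octaves[-octave]
        image = resize_nearest(image, hi.shape[:2]) + hi
    return image

rng = np.random.default_rng(0)
original = rng.uniform(size=(32, 32, 3)) * 255.0
smallest, octaves = split_octaves(original)
restored = rebuild(smallest, octaves)
print(smallest.shape, np.allclose(restored, original))
```

In the full algorithm the gradient ascent steps run between the upscaling stages, so the dreamed patterns are added at every scale while the stored detail keeps the original image recognizable.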

Step 7: Image Loading and Display

image = PIL.Image.open('/content/flower.jpg')
image = np.float32(image)
show_image(image/255.0)

Input image:

[Figure: input image]

Style Selection Dropdown for Image Styling

style_options = {
    'Select a Style': "",
    'Style 1': Tensor(layer_name)[:,:,:,0]+Tensor(layer_name)[:,:,:,139]+Tensor(layer_name)[:,:,:,115],
    'Style 2': Tensor(layer_name)[:,:,:,1]+Tensor(layer_name)[:,:,:,139],
    'Style 3': Tensor(layer_name)[:,:,:,65],
    'Style 4': Tensor(layer_name)[:,:,:,67]+Tensor(layer_name)[:,:,:,68]+Tensor(layer_name)[:,:,:,139],
    'Style 5': Tensor(layer_name)[:,:,:,68],
    'Style 6': Tensor(layer_name)[:,:,:,70],
    'Style 7': Tensor(layer_name)[:,:,:,113],
    'Style 8': Tensor(layer_name)[:,:,:,114],
    'Style 9': Tensor(layer_name)[:,:,:,115],
    'Style 10': Tensor(layer_name)[:,:,:,117],
    'Style 11': Tensor(layer_name)[:,:,:,121],
    'Style 12': Tensor(layer_name)[:,:,:,129],
    'Style 13': Tensor(layer_name)[:,:,:,135],
    'Style 14': Tensor(layer_name)[:,:,:,137],
    'Style 15': Tensor(layer_name)[:,:,:,138],
    'Style 16': Tensor(layer_name)[:,:,:,139],
    'Style 17': Tensor(layer_name)[:,:,:,1]+Tensor(layer_name)[:,:,:,13]
}
dropdown = widgets.Dropdown(options=style_options, description='Select Style')
def select_style(change):
    global style
    style = change.new
dropdown.observe(select_style, names='value')
display(dropdown)

Generate DeepDream Image

# Pick a style in the dropdown first; `style` is only set once a selection is made.
deepdream(style, image)

Generated output image:

[Figure: generated DeepDream image]

Implementation of Neural Style Transfer

Neural Style Transfer was covered in detail in the previous tutorial. Refer to that tutorial for the full explanation and the code.

Conclusion

Deep Dream and Neural Style Transfer open up new opportunities for both creative expression and technical exploration. They make it possible to apply artistic styles to images and offer insight into how neural networks work. Alongside this creative potential, ethical considerations remain important. These techniques represent important developments in generative AI and will continue to influence visual creativity.
