
At WWDC 2017, Apple introduced CoreML (Core ML), a new framework for running machine learning models inside applications on Apple devices. To simplify working with the new framework, Apple also released a suite of tools for converting machine learning models from existing, established systems such as Keras into the CoreML format.

This article walks through the process of creating a generative adversarial network (GAN) trained on the MNIST dataset and embedding it in a Swift iOS app.

Note that all of the code described below is available in our GitHub repository:

https://github.com/zedge/GANforiPhoneWithCoreML

There are several constraints when working with CoreML and Keras at the time of this writing (June 22nd, 2017): the coremltools converter must be run under Python 2.7 and requires Keras 1.2.x. If you prefer developing your model in Python 3, that still works, since Keras models can be moved across Python versions by saving the architecture and weights in Python 3 and restoring them in Python 2.7. We recommend installing these dependencies in an isolated environment, such as Conda.

First, we set up a Conda environment with our dependencies. The following packages are required to run the coremltools package with Keras:
  • Python 2.7
  • h5py
  • Keras 1.2.x (we used 1.2.2)
  • TensorFlow 1.0.x or 1.1.x (we used 1.0.0)

A Conda environment file is included in the repository to simplify the process of installing these.

conda env create --file environment.yml


The idea behind a GAN is simple: two networks compete. The generator tries to fake samples from a target distribution, and the discriminator tries to distinguish between fake and real samples. The goal is for both networks to converge in such a way that the generator ends up producing samples indistinguishable from true samples.
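The Python snippets below assume roughly the following Keras 1.2.2 imports, consolidated here for readability (the exact imports in the repository may differ slightly):

import numpy as np

from keras.datasets import mnist
from keras.models import Model, Sequential
from keras.layers import (Input, Dense, Activation, BatchNormalization, Reshape,
                          Flatten, Convolution2D, MaxPooling2D, UpSampling2D)
from keras.optimizers import SGD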

We load MNIST and preprocess it, scaling the pixel values to [-1, 1] to match the tanh output of the generator:

(X_train, y_train), (X_test, y_test) = mnist.load_data()
 
def preprocess_X(X):
    X = X[..., np.newaxis]
    X = (X.astype(np.float32) - 127.5)/127.5
    return X

X_train = preprocess_X(X_train)
X_test = preprocess_X(X_test)

Our data is now ready to be run through our GAN; however, we have yet to create the GAN itself. We start by writing a generator.

def assemble_generator(z_dim):
    input_x = Input(shape=(z_dim,))
    x = input_x
    
    x = Dense(1024)(x)
    x = Activation('tanh')(x)
    x = Dense(128*7*7)(x)
    x = BatchNormalization()(x)
    x = Activation('tanh')(x)
    x = Reshape((7, 7, 128))(x)
    
    x = UpSampling2D(size=(2, 2))(x) # 14x14
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('tanh')(x)
    
    x = UpSampling2D(size=(2, 2))(x) # 28x28
    x = Convolution2D(1, 5, 5, border_mode='same')(x)
    x = Activation('tanh')(x)
    
    return Model(input=input_x, output=x)

...and a discriminator:

def assemble_discriminator():
    input_x = Input(shape=(28, 28, 1))
    x = input_x
    
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('tanh')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    
    x = Convolution2D(128, 5, 5)(x)
    x = Activation('tanh')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    
    x = Flatten()(x)
    x = Dense(1024)(x)
    x = Activation('tanh')(x)
    x = Dense(1)(x)
    x = Activation('sigmoid')(x)
    
    return Model(input=input_x, output=x)

The complete GAN may be assembled using a Sequential model, allowing the output of the generator network to enter the discriminator network:

z_dim = 100
generator = assemble_generator(z_dim)
discriminator = assemble_discriminator()
adversarial = Sequential()
adversarial.add(generator)
adversarial.add(discriminator)

generator.summary()
discriminator.summary()
adversarial.summary()
discriminator.trainable = False
adversarial.compile(
    optimizer=SGD(lr=0.0005, momentum=0.9, nesterov=True),
    loss='binary_crossentropy'
)
discriminator.trainable = True

discriminator.compile(
    optimizer=SGD(lr=0.0005, momentum=0.9, nesterov=True),
    loss='binary_crossentropy'
)

Note that discriminator.trainable is set to False before compiling the adversarial model and restored afterwards: in Keras 1.x the trainable flag takes effect at compile time, so the discriminator's weights are frozen when the combined model is trained, but remain trainable when the discriminator is trained on its own. We can now train the networks. Preferably we would keep separate training, validation and test sets; here we use the training set and the following hyperparameters:

batch_size = 128
epochs = 30
epoch = 0

X = X_train
y = y_train

batches_per_epoch = len(X) // batch_size
for e in range(epochs):
    epoch += 1
    
    for i in range(batches_per_epoch):
        # Discriminator: half real samples (label 1), half generated samples (label 0)
        real_X = X[np.random.randint(0, len(X), batch_size//2), ...]
        
        z = np.random.uniform(-1.0, 1.0, size=(batch_size//2, z_dim))
        fake_X = generator.predict(z)
        
        X_batch = np.concatenate((real_X, fake_X))
        y_batch = np.zeros((batch_size, 1))
        y_batch[:batch_size//2] = 1
        discriminator.train_on_batch(X_batch, y_batch)
        
        # Generator: trained through the frozen discriminator, with target labels of 1
        z = np.random.uniform(-1.0, 1.0, size=(batch_size, z_dim))
        y_batch = np.ones((batch_size, 1))
        adversarial.train_on_batch(z, y_batch)
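After training, the generator's architecture and weights can be serialized so they can be loaded in the Python 2.7 conversion step below. A minimal sketch (the file names match the ones used in the conversion snippet; the repository may organize this step differently):

# Save the generator architecture as JSON and its weights as HDF5 (requires h5py).
with open('generator.json', 'w') as f:
    f.write(generator.to_json())
generator.save_weights('generator.h5')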

When the model has been trained sufficiently, it can be exported to the .mlmodel format consumed by CoreML applications. The following code loads the saved Keras generator and converts it:

from coremltools.converters.keras import convert
from keras.models import model_from_json

model = None
with open('generator.json', 'r') as f:
    model = model_from_json(f.read())
model.load_weights('generator.h5')

coreml_model = convert(model)
coreml_model.save('mnistGenerator.mlmodel')

The complete code for the GAN can be found here:
https://github.com/zedge/GANforiPhoneWithCoreML/bl...

The .mlmodel file needs to be imported into Xcode: just drag it into the project and Xcode will generate the wrapper classes for you, named after the model file (mnistGenerator in our case).

We then need a class for generating image data from the model.

We’ve put this in a class called Generator, which contains two functions and one property.
First we need to initialize the model wrapper.

let model = mnistGenerator()

Then we need a function for generating the input data for the model.

func generateRandomData() -> MLMultiArray? {
    guard let input = try? MLMultiArray(shape: [100], dataType: MLMultiArrayDataType.double) else {
        return nil
    }
    
    // Fill the input vector with uniform random values in [-1, 1],
    // matching the distribution used during training.
    for i in 0..<100 {
        let number = 2 * Double(Float(arc4random()) / Float(UINT32_MAX)) - 1
        input[i] = NSNumber(floatLiteral: number)
    }
    
    return input
}

This function generates an MLMultiArray with 100 random numbers that can be used as input for the generator.

We also need a function that can take input data and use the model to generate output.

func generate(input: MLMultiArray, verbose: Bool = false) -> MLMultiArray? {
    if let generated = try? model.prediction(input1: input) {
        
        if verbose {
            // Print a rough ASCII rendering of the generated digit to the console.
            for i in 0..<28 {
                var s: String = ""
                for j in 0..<28 {
                    let out = generated.output1[i*28 + j] as! Double
                    s += out < 0 ? "0" : "1"
                }
                print(s)
            }
        }
        
        return generated.output1
    }
    
    return nil
}

This function uses the converted generator model to produce an MLMultiArray containing image data. The data holds one Double per pixel, with values between -1 and 1.

We can use the output of generateRandomData() as input for this function.

Because of some Core Graphics quirks, getting the image drawing to work both in the simulator and on a device requires a helper class that can tell us whether or not we are running on the simulator.

class DeviceInformation {
    
    class var simulator: Bool {
        #if (arch(i386) || arch(x86_64)) && os(iOS)
            return true
        #else
            return false
        #endif
    }
    
}

This class uses the architecture we are running on to determine whether we are in the simulator. We used it in place of the TARGET_IPHONE_SIMULATOR macro, which did not work as expected and always reported that we were running on an iPhone.

In the ViewController we need a function that converts the MLMultiArray into a byte array so we can feed it to CoreGraphics.

func convert(_ data: MLMultiArray) -> [UInt8] {
    
    var byteData: [UInt8] = []
    
    for i in 0..<data.count {
        let out = data[i]
        let floatOut = out as! Float32
        
        if DeviceInformation.simulator {
            // The simulator accepts float32 textures: map [-1, 1] to [0, 1]
            // and append the raw bytes of the float.
            let bytesOut = toByteArray((floatOut + 1.0) / 2.0)
            byteData.append(contentsOf: bytesOut)
        }
        else {
            // On a device we use 8-bit grayscale: map [-1, 1] to [0, 255].
            let byteOut: UInt8 = UInt8((floatOut * 127.5) + 127.5)
            byteData.append(byteOut)
        }
    }
    
    return byteData
}


func toByteArray<T>(_ value: T) -> [UInt8] {
    var value = value
    return withUnsafeBytes(of: &value) { Array($0) }
}

The output of the CoreML network consists of Double values, which cannot be uploaded directly to a CGImage, so a manual conversion is needed. It also turned out to be necessary to use different pixel formats on the simulator and on a real device: the simulator happily accepted float32 textures, while the device would not, so on the device the values are converted to single bytes instead. In both cases the converted values are put in a byte array for later consumption by CoreGraphics.

Once we have the image data as a byte array, we need to draw it to the image view. First we create a method that creates a CGImage from the data.

func createImage(data: [UInt8], width: Int, height: Int, components: Int) -> CGImage? {
    
    let colorSpace: CGColorSpace
    switch components {
    case 1:
        colorSpace = CGColorSpaceCreateDeviceGray()
    case 3:
        colorSpace = CGColorSpaceCreateDeviceRGB()
    default:
        fatalError("Unsupported number of components per pixel.")
    }
    
    let cfData = CFDataCreate(nil, data, width * height * components * bitsPerComponent / 8)!
    let provider = CGDataProvider(data: cfData)!
    
    let image = CGImage(width: width,
                        height: height,
                        bitsPerComponent: bitsPerComponent,
                        bitsPerPixel: bitsPerComponent * components,
                        bytesPerRow: ((bitsPerComponent * components) / 8) * width,
                        space: colorSpace,
                        bitmapInfo: bitMapInfo,
                        provider: provider,
                        decode: nil,
                        shouldInterpolate: false,
                        intent: CGColorRenderingIntent.defaultIntent)!
    
    return image
}

This function takes the image data, the width, the height and the number of components per pixel.

We choose the color space depending on the number of components per pixel, create a CGDataProvider with the data, hand it to the CGImage initializer and return the image.
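Note that createImage(data:width:height:components:) refers to two properties, bitsPerComponent and bitMapInfo, that are not shown in the snippets above. A minimal sketch of how they could be defined, assuming they simply switch on DeviceInformation.simulator to match the float32 (simulator) versus 8-bit grayscale (device) formats used in convert(_:):

// Hypothetical sketch of the pixel-format properties assumed by createImage(...).
var bitsPerComponent: Int {
    // 32-bit float components in the simulator, 8-bit bytes on a device.
    return DeviceInformation.simulator ? 32 : 8
}

var bitMapInfo: CGBitmapInfo {
    if DeviceInformation.simulator {
        // Float components in native (little-endian) byte order.
        return CGBitmapInfo(rawValue: CGBitmapInfo.floatComponents.rawValue
            | CGBitmapInfo.byteOrder32Little.rawValue)
    }
    // Plain 8-bit grayscale, no alpha channel.
    return CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue)
}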

Now we need a method that ties everything together: it gets random input, runs it through the model, converts the output to a byte array, creates the image and draws it on screen.

@objc func generateImage() {
    if let data = generator.generateRandomData(),
        let output = generator.generate(input: data, verbose: true) {
        
        let byteData = convert(output)
        let image = createImage(data: byteData, width: 28, height: 28, components: 1)
        
        // Display the image.
        if let image = image {
            DispatchQueue.main.async {
                self.imageView.image = UIImage(cgImage: image)
            }
        }
    }
}

In the ViewController we have a simple UIButton and a UIImageView.

We have set generateImage() as the button's target action, so each time the button is pressed a new image is generated and drawn to the image view.
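For completeness, the relevant ViewController properties and button wiring might look roughly like the following (the property and outlet names are assumptions; the repository may name them differently):

let generator = Generator()

@IBOutlet weak var imageView: UIImageView!
@IBOutlet weak var button: UIButton!

override func viewDidLoad() {
    super.viewDidLoad()
    
    // Generate and display a new image every time the button is tapped.
    button.addTarget(self, action: #selector(generateImage), for: .touchUpInside)
}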

Measurements for 1000 generations (10 runs per device):

iPhone 6S: average: 12.615, relative standard deviation: 0.489%, values: [12.641996, 12.573315, 12.718099, 12.492368, 12.695472, 12.626798, 12.560098, 12.605785, 12.626817, 12.604572]

iPhone 5S: average: 13.462, relative standard deviation: 6.737%, values: [11.872854, 11.869948, 13.022276, 13.234813, 13.540368, 14.019224, 14.375196, 14.126889, 14.272390, 14.281087]

With devices becoming more powerful, being able to run GANs on a mobile device is a big step towards a future where an Internet connection is no longer necessary for applications that previously had to run on servers.

Overall, the CoreML toolset makes it exceedingly simple to use trained models on iOS devices, and support for Keras 2.0 and Python 3 may not be far away. Some of the functionality is not yet fully mature, such as support for all Keras layer types, but this will likely improve as the API gets closer to an official non-beta release.

Around the same time, Google announced TensorFlow Lite in May and released MobileNets in June, with functionality parallel to CoreML and the Vision framework respectively.

Best regards,

Nils Barlaug, Jørgen Henrichsen, John Chen, Håvard Bjerke, Jonas Gedde-Dahl and Camilla Dahlstrøm