A linear regression example
In this section we work through a brief example of solving the same problem that we solved with inverse theory, but from the perspective of a machine learning package and a single-neuron neural network. See here for a reference on how the problem is solved from an inverse theory perspective: Simple mathematical tricks to force a linear problem
In this problem we will look at Moore's Law, which states that the average number of transistors in integrated circuits doubles approximately every 2 years. We will fit an exponential curve to data on the number of transistors in integrated circuits over the years to see if this rule holds. The curve to fit is a basic exponential of the form n(t) = n_0 e^(α t).
We will follow the same logic as in the inverse theory solution and first pre-process the data by taking the natural log of the number of transistors, which serves as the "data labels" in this example.
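Taking the natural log of the model gives ln n(t) = ln n_0 + α t, which is linear in t with slope α and intercept ln n_0 (here n_0, the initial count, and α, the growth rate, are our own choice of symbol names). Note that doubling every 2 years corresponds to a slope of α = ln(2)/2 ≈ 0.35 per year, a value we can compare with the fitted slope later.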
First we need to import the necessary packages. Here we use pandas to read the data, matplotlib to plot it, and TensorFlow and Keras as the framework for running the regression.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Make NumPy printouts easier to read.
np.set_printoptions(precision=3, suppress=True)
print(tf.__version__)
We now need to read the data in from the CSV file. There are many ways to do this; the choice of pandas here is arbitrary. We then extract the necessary columns for the training labels and inputs and apply some pre-processing.
df = pd.read_csv('moores_law.csv', header=None)
# Column 0 holds the transistor counts, column 1 the years (the CSV has no header row).
train_labels = np.log(df[0].values)
train_feature = df[1].values
Although not strictly necessary for a single input feature like the one in this example, it is always good practice to apply some normalization to your input features, so we shall instantiate a normalizer here. The `tf.keras.layers.Normalization` layer is a clean and simple way to add feature normalization into your model. The first step is to create the layer. Note that the use of the `axis` keyword here is important. From the documentation:
"...the axis or axes that should have a separate mean and variance for each index in the shape. For
example,
if shape is (None, 5) and axis=1, the layer will track 5 separate mean and variance values for the last
axis. If axis is set to None, the layer will normalize all elements in the input by a scalar mean and
variance. Defaults to -1, where the last axis of the input is assumed to be a feature dimension and is
normalized per index. Note that in the specific case of batched scalar inputs where the only axis is the
batch axis, the default will normalize each index in the batch separately. In this case, consider
passing axis=None."
As we have a 1D array (a single feature vector), it is prudent for us to pass in `axis=None`.
# Convert the feature to a NumPy array and adapt the normalization layer to it.
train_feature = np.array(train_feature)
normalizer = layers.Normalization(input_shape=[1,], axis=None)
normalizer.adapt(train_feature)
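As a quick sanity check (an optional step, not required for the regression), we can confirm the layer has stored the feature's statistics after `adapt`:

# Optional: inspect the mean and variance learned by adapt().
print('mean:', normalizer.mean.numpy())
print('variance:', normalizer.variance.numpy())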
Next, it's time to define our model. We'll use a sequential model from Keras in which we apply our normalization and feed the result into a dense layer with a single neuron. We know from the previous topic that this is equivalent to a linear regression.
linear_model = tf.keras.Sequential([
    normalizer,
    layers.Dense(units=1)
])
linear_model.summary()
Model: "sequential_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= normalization_5 (Normalizat (None, 1) 3 ion) dense_1 (Dense) (None, 1) 2 ================================================================= Total params: 5 Trainable params: 2 Non-trainable params: 3 _________________________________________________________________
linear_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),
    loss='mean_absolute_error')
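We then fit the model to the data, training for 100 epochs and holding back 20% of the samples to monitor validation loss.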
%%time
history = linear_model.fit(
    train_feature,
    train_labels,
    epochs=100,
    # Suppress logging.
    verbose=0,
    # Calculate validation results on 20% of the training data.
    validation_split=0.2)
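We can inspect the training progress by plotting the loss and validation loss stored in the `history` object: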
def plot_loss(history):
    plt.plot(history.history['loss'], label='loss')
    plt.plot(history.history['val_loss'], label='val_loss')
    plt.ylim([0, 10])
    plt.xlabel('Epoch')
    plt.ylabel('Error')
    plt.legend()
    plt.grid(True)
plot_loss(history)
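Because the single neuron is just a linear regression, we can also read the fitted slope and intercept back out of the trained model. The sketch below is an optional check, not part of the workflow above; it assumes the two-layer Sequential layout we defined (normalizer first, dense layer second) and undoes the normalization to express the line in the original feature units:

# Optional sketch: recover the fitted line parameters (assumes the
# two-layer Sequential model defined above).
w, b = linear_model.layers[1].get_weights()  # kernel (1, 1) and bias (1,)
mean = float(normalizer.mean.numpy().squeeze())
var = float(normalizer.variance.numpy().squeeze())

# The model computes y = w * (t - mean) / sqrt(var) + b, so in the
# un-normalized feature the slope and intercept are:
slope = float(w.squeeze()) / np.sqrt(var)
intercept = float(b.squeeze()) - slope * mean
print(f'ln(n) ~ {slope:.3f} * t + {intercept:.3f}')
print(f'Implied doubling time: {np.log(2) / slope:.2f} years')

If Moore's Law holds, the implied doubling time should come out close to 2 years.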
def plot_predictions(x, y, xpred, ypred, xlabel, ylabel):
    plt.figure()
    plt.plot(x, y, 'x', label='Data')
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(f'{ylabel} vs {xlabel}')
    plt.plot(xpred, ypred, 'r--', label='Prediction')
    plt.legend()
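Finally, we evaluate the trained model across the range of the input feature and plot its prediction against the data: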
maxf = max(train_feature)
minf = min(train_feature)
xpred = tf.linspace(minf, maxf, int(maxf - minf) + 1)
ypred = linear_model.predict(xpred)
plot_predictions(train_feature, train_labels, xpred, ypred,
                 'Year', 'ln(Number of transistors)')

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.