TensorFlow with Estimators

Previously we saw how to build a full Multi-Layer Perceptron model with full Sessions in TensorFlow. Unfortunately, this was an extremely involved process. However, developers have created Estimators, which offer an easier-to-use workflow!

The Estimator API is much easier to use, but you sacrifice some level of customization of your model. Let's go ahead and explore it!

Get the Data

We will use the iris data set.

Let's get the data:

In [1]:
import pandas as pd
In [2]:
df = pd.read_csv('iris.csv')
In [3]:
df.head()
Out[3]:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3 0.2 0.0
3 4.6 3.1 1.5 0.2 0.0
4 5.0 3.6 1.4 0.2 0.0
In [4]:
df.columns = ['sepal_length','sepal_width','petal_length','petal_width','target']
In [5]:
X = df.drop('target',axis=1)
y = df['target'].apply(int)

Train Test Split

In [6]:
from sklearn.model_selection import train_test_split
In [7]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

Estimators

Let's see how to use the simpler Estimator interface!

In [8]:
import tensorflow as tf

Feature Columns

In [9]:
X.columns
Out[9]:
Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')
In [10]:
feat_cols = []

for col in X.columns:
    feat_cols.append(tf.feature_column.numeric_column(col))
In [11]:
feat_cols
Out[11]:
[_NumericColumn(key='sepal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='sepal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='petal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='petal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

Input Function

In [12]:
# pandas_input_fn feeds a DataFrame directly; there is also a numpy_input_fn for NumPy arrays
input_func = tf.estimator.inputs.pandas_input_fn(x=X_train,y=y_train,batch_size=10,num_epochs=5,shuffle=True)
In [13]:
classifier = tf.estimator.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3,feature_columns=feat_cols)
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\Marcial\AppData\Local\Temp\tmp3_l8l99d
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\Marcial\\AppData\\Local\\Temp\\tmp3_l8l99d', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000020ED5FA2390>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
In [14]:
classifier.train(input_fn=input_func,steps=50)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\Marcial\AppData\Local\Temp\tmp3_l8l99d\model.ckpt.
INFO:tensorflow:loss = 15.285385, step = 1
INFO:tensorflow:Saving checkpoints for 50 into C:\Users\Marcial\AppData\Local\Temp\tmp3_l8l99d\model.ckpt.
INFO:tensorflow:Loss for final step: 3.4342575.
Out[14]:
<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x20edb48f748>
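
The way batch_size, num_epochs, and steps interact can be checked with a little arithmetic. The sketch below assumes the usual 150-row iris data set (the row count is not printed above) and that train_test_split rounds the test share up; the names batches_available and steps_run are made up for this illustration.

```python
import math

# Assumption: the standard iris data set has 150 rows.
n_rows = 150
n_test = math.ceil(n_rows * 3 / 10)  # test_size=0.3, test share rounded up
n_train = n_rows - n_test

# Values copied from the input function and the train() call above.
batch_size = 10
num_epochs = 5
steps = 50

# Approximate number of batches the input function can serve
# before its data (n_train rows repeated num_epochs times) runs out.
batches_available = math.ceil(n_train * num_epochs / batch_size)

# Training stops at whichever limit is reached first.
steps_run = min(steps, batches_available)

print(n_train, n_test, batches_available, steps_run)
```

Under these assumptions the steps argument is the limit that ends training, which is consistent with the final checkpoint being saved at step 50 in the log above. With a smaller num_epochs, the input function would run out of data first and training would stop earlier.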

Model Evaluation

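Use the predict method of the classifier to create predictions from X_test. No labels are passed to this input function, and shuffle=False keeps the predictions in the same order as y_test.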

In [15]:
pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test,batch_size=len(X_test),shuffle=False)
In [16]:
note_predictions = list(classifier.predict(input_fn=pred_fn))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\Marcial\AppData\Local\Temp\tmp3_l8l99d\model.ckpt-50
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
In [17]:
note_predictions[0]
Out[17]:
{'class_ids': array([2], dtype=int64),
 'classes': array([b'2'], dtype=object),
 'logits': array([-3.6269774 ,  0.16824062,  1.2134217 ], dtype=float32),
 'probabilities': array([0.00581369, 0.2586391 , 0.7355472 ], dtype=float32)}
In [18]:
final_preds  = []
for pred in note_predictions:
    final_preds.append(pred['class_ids'][0])
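
To see how the fields of a prediction relate to each other, the following sketch recomputes 'probabilities' from 'logits' with a plain softmax and then takes the index of the largest probability, which is what 'class_ids' reports. The softmax helper is written here for illustration only; it is not a TensorFlow function.

```python
import math

# Logits copied from the first prediction shown in Out[17].
logits = [-3.6269774, 0.16824062, 1.2134217]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)

# The predicted class is the index of the largest probability.
predicted_class = probs.index(max(probs))

print([round(p, 4) for p in probs])
print(predicted_class)
```

The recomputed values should agree with the 'probabilities' array in Out[17] up to rounding, and the predicted class should match the 'class_ids' entry.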

Now create a classification report and a Confusion Matrix. Does anything stand out to you?

In [19]:
from sklearn.metrics import classification_report,confusion_matrix
In [20]:
print(confusion_matrix(y_test,final_preds))
[[20  0  0]
 [ 0  6  0]
 [ 0  0 19]]
In [21]:
print(classification_report(y_test,final_preds))
             precision    recall  f1-score   support

          0       1.00      1.00      1.00        20
          1       1.00      1.00      1.00         6
          2       1.00      1.00      1.00        19

avg / total       1.00      1.00      1.00        45
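
The numbers in the classification report can be derived directly from the confusion matrix. The sketch below does this by hand in plain Python, assuming rows are true classes and columns are predicted classes (the scikit-learn convention); per_class_metrics is a hypothetical helper written for this illustration.

```python
# Confusion matrix copied from Out[20]:
# rows are true classes, columns are predicted classes.
cm = [
    [20, 0, 0],
    [0, 6, 0],
    [0, 0, 19],
]


def per_class_metrics(matrix):
    """Return a (precision, recall, support) tuple for each class."""
    n = len(matrix)
    results = []
    for k in range(n):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        support = sum(matrix[k])                           # row sum
        results.append((true_pos / predicted_k, true_pos / support, support))
    return results


metrics = per_class_metrics(cm)

# Accuracy is the diagonal total divided by the number of test rows.
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for k, (precision, recall, support) in enumerate(metrics):
    print(k, precision, recall, support)
print(accuracy)
```

One thing that stands out is the support column: class 1 has only 6 test rows, compared with 20 and 19 for the other classes, because the split above is random and unstratified. Passing stratify=y to train_test_split would keep the class proportions equal in both sets, and random_state would make the split repeatable.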

Great Job!