House Finder¶
M. Fawcett - 08/28/2021
This Python program identifies houses of a particular architectural style using images retrieved from Google Street View and a neural network model trained to recognize house styles.
One style of house is the "Craftsman Style". Here is an example...
For more information about the Craftsman Style, see this article... https://www.antiquehomesmagazine.com/historic-style-guide/craftsman/
# !pip install tensorflow
# !pip install keras
%matplotlib inline
# Load packages I'll be using
import os.path # functions involving path names
import glob # finding file names that match a pattern
import matplotlib.pyplot as plt # plots and graphs
import seaborn as sns # nicer colors and options for plots
import keras # artificial neural network construction using tensorflow
import tensorflow as tf
from tensorflow.keras import layers # used below for data augmentation and model building
# from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Dropout
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import classification_report, confusion_matrix
import numpy as np # numerical arrays and math
import csv # for reading/writing csv files
import urllib.parse # for building url strings
import urllib.request # for making url requests
# The next three are for resizing URL images in memory without saving to disk file first
from PIL import Image
import requests
from io import BytesIO
House pictures¶
Find random pictures of Craftsman-style ("craft") houses on the internet and save them in a folder named ImagesCraft. Find pictures of houses that are not Craftsman style and save them in a folder named ImagesNotCraft. Both folders are in a parent folder named HouseImages.
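As a quick sanity check on that folder layout, the short sketch below (my addition; it assumes the HouseImages/ImagesCraft and HouseImages/ImagesNotCraft folders described above already exist) counts the pictures gathered for each class.
# Sanity-check sketch (assumes the folder layout described above).
import os
for folder_name in ("ImagesCraft", "ImagesNotCraft"):
    print(folder_name, len(os.listdir(os.path.join("HouseImages", folder_name))), "files")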
Filter out bad files¶
I only want "jpg" files.
import os
num_skipped = 0
for folder_name in ("ImagesCraft", "ImagesNotCraft"):
    folder_path = os.path.join("HouseImages", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        try:
            fobj = open(fpath, "rb")
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
        finally:
            fobj.close()

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)
print("Deleted %d images" % num_skipped)
Deleted 1 images
# Define a function to set up the training and validation datasets.
# Cite: https://keras.io/examples/vision/image_classification_from_scratch/
# Keras does things that are helpful but not always clear in the code. The two classes of houses (craft and not craft)
# are stored in separate folders under the HouseImages folder. Keras picks one folder to be class "0" and
# the other folder as class "1". The class label is automatically attached to each image based on what folder
# it was in. The datasets this code generates are "tensorflow" datasets.
image_size = (180, 180)
batch_size = 16
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "HouseImages",
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "HouseImages",
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
Found 183 files belonging to 2 classes.
Using 147 files for training.
Found 183 files belonging to 2 classes.
Using 36 files for validation.
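Keras exposes the folder-to-class mapping through the dataset's class_names attribute; the quick check below (my addition) confirms which folder became class "0" and which became class "1". Since image_dataset_from_directory sorts class names alphabetically, ImagesCraft should map to 0 and ImagesNotCraft to 1.
# Confirm the folder-to-class mapping (expected: ['ImagesCraft', 'ImagesNotCraft'])
print(train_ds.class_names)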
Visualize some of the data¶
plt.figure(figsize=(6, 6))
for images, labels in train_ds.take(1):
    for i in range(2):
        ax = plt.subplot(1, 2, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(int(labels[i]))
        plt.axis("off")

# Note, grayscale images should have shape (h, w) rather than (h, w, 1). You can use
# squeeze() to eliminate the third dimension: plt.imshow(data.squeeze())
# Would use the following code in that case:
# plt.imshow(tf.squeeze(images[i].numpy().astype("uint8")), cmap = 'gray')
# Cite: https://stackoverflow.com/questions/2659312/how-do-i-convert-a-numpy-array-to-and-display-an-image
Augment the data¶
Additional training samples can be generated by horizontally flipping and slightly rotating random house pictures.
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.05),
    ]
)
# What this looks like when one image is augmented repeatedly
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")
Build model¶
Main Cite: https://keras.io/examples/vision/image_classification_from_scratch/
Also cite: https://www.analyticsvidhya.com/blog/2020/10/create-image-classification-model-python-keras/
Build a small version of the Xception network, a convolutional neural network architecture developed at Google (2017) that performs well at image classification. See https://openaccess.thecvf.com/content_cvpr_2017/papers/Chollet_Xception_Deep_Learning_CVPR_2017_paper.pdf
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape = input_shape)
    # Image augmentation block
    x = data_augmentation(inputs)

    # Entry block
    x = layers.Rescaling(1.0 / 255)(x)
    x = layers.Conv2D(32, 3, strides = 2, padding = "same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.Conv2D(64, 3, padding = "same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    previous_block_activation = x  # Set aside residual

    for size in [128, 256, 512, 728]:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding = "same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding = "same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.MaxPooling2D(3, strides = 2, padding = "same")(x)

        # Project residual
        residual = layers.Conv2D(size, 1, strides = 2, padding = "same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual

    x = layers.SeparableConv2D(1024, 3, padding = "same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)
    if num_classes == 2:
        activation = "sigmoid"
        units = 1
    else:
        activation = "softmax"
        units = num_classes

    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(units, activation = activation)(x)
    return keras.Model(inputs, outputs)
model = make_model(input_shape=image_size + (3,), num_classes=2)
# keras.utils.plot_model(model, show_shapes=True)
Show the model's network structure¶
print(model.summary())
Model: "model_3"
[model.summary() layer table: input (None, 180, 180, 3) → augmentation Sequential → Rescaling → Conv2D(32) and Conv2D(64) entry blocks with BatchNormalization/ReLU → four separable-convolution residual blocks (128, 256, 512, 728 filters), each ending in MaxPooling2D plus a 1x1 Conv2D projected residual → SeparableConv2D(1024) → GlobalAveragePooling2D → Dropout → Dense(1)]
Total params: 2,782,649
Trainable params: 2,773,913
Non-trainable params: 8,736
None
Train the model¶
Save only the best model, i.e. the one with the lowest validation loss (val_loss).
epochs = 50
# Cite: https://towardsdatascience.com/keras-callbacks-and-how-to-save-your-model-from-overtraining-244fc1de8608
filepath = "my_best_model.epoch{epoch:02d}-loss{val_loss:.2f}.h5"
checkpoint = keras.callbacks.ModelCheckpoint(filepath = filepath,
                                             monitor = "val_loss",
                                             verbose = 1,
                                             save_best_only = True,
                                             mode = "min")
callbacks = [checkpoint]

model.compile(
    # optimizer=keras.optimizers.Adam(1e-3),
    optimizer = Adam(1e-3),
    loss = "binary_crossentropy",
    metrics = ["accuracy"],
)
model.fit(
    train_ds, epochs = epochs, callbacks = callbacks, validation_data = val_ds,
)
Epoch 1/50 10/10 [==============================] - 15s 1s/step - loss: 0.6990 - accuracy: 0.5850 - val_loss: 0.6713 - val_accuracy: 0.7222
Epoch 00001: val_loss improved from inf to 0.67128, saving model to my_best_model.epoch01-loss0.67.h5
[Training log, epochs 2-6: val_loss improved each epoch, from 0.67128 down to 0.58957, saving checkpoints through my_best_model.epoch06-loss0.59.h5]
[Training log, epochs 7-50: val_loss did not improve from 0.58957; training accuracy climbed as high as 0.9864 while val_accuracy stayed at 0.7222 and val_loss grew, ending at 4.8606]
<keras.callbacks.History at 0x7fe7bd33c080>
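A useful follow-up (my addition, assuming the return value of model.fit above is captured, e.g. history = model.fit(...)) is to plot the training and validation loss per epoch; the widening gap makes the overfitting visible.
# Sketch: plot per-epoch loss curves from the History object returned by model.fit.
# Assumes: history = model.fit(train_ds, epochs = epochs, callbacks = callbacks, validation_data = val_ds)
def plot_history(history):
    plt.figure(figsize = (8, 4))
    plt.plot(history.history["loss"], label = "loss")
    plt.plot(history.history["val_loss"], label = "val_loss")
    plt.xlabel("epoch")
    plt.legend()
    plt.show()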
Test the model¶
Use the best model (the one with the lowest validation loss) to classify some previously unseen houses. The test images are stored in a folder called "TestImages". Models are saved in the working directory.
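Checkpoints accumulate in the working directory, so rather than typing the file name by hand, one option (a sketch of my own, assuming the my_best_model.epochNN-lossX.XX.h5 naming pattern used above) is to let glob find the checkpoint whose file name encodes the lowest validation loss.
# Sketch: pick the saved checkpoint with the lowest val_loss encoded in its file name.
import glob, re
def best_checkpoint(pattern = "my_best_model.*.h5"):
    best_path, best_loss = None, float("inf")
    for path in glob.glob(pattern):
        m = re.search(r"loss(\d+\.\d+)\.h5$", path)
        if m and float(m.group(1)) < best_loss:
            best_path, best_loss = path, float(m.group(1))
    return best_path
# Example: model = keras.models.load_model(best_checkpoint())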
# Remove non-jpg files from the test folder. Mainly this is used to delete a
# hidden .DS_Store file
num_skipped = 0
for fname in os.listdir("TestImages"):
    fpath = os.path.join("TestImages", fname)
    try:
        fobj = open(fpath, "rb")
        is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
    finally:
        fobj.close()

    if not is_jfif:
        num_skipped += 1
        # Delete corrupted image
        os.remove(fpath)
print("Deleted %d images" % num_skipped)
Deleted 0 images
Evaluate the test images¶
The testing is done on images stored as jpg files on disk. For evaluations done while searching through large batches of addresses, downloaded images will be kept in memory and only saved to disk if they exceed the scoring threshold for Arts and Crafts architecture.
# Load the best model
best_model_filepath = "my_best_model.epoch32-loss0.48.h5"
model = keras.models.load_model(best_model_filepath)
# Or just use the last model created during fit.
# model = model
path = "TestImages/"
icount = len(os.listdir(path))  # number of test images
j = 0  # a counter for the subplot position
plt.figure(figsize=(40, 40))
for picfile in os.listdir(path):
    try:
        ax = plt.subplot(icount, 1, j + 1)
        img = keras.preprocessing.image.load_img(
            os.path.join("TestImages/", picfile), target_size=image_size
        )
        img_array = keras.preprocessing.image.img_to_array(img)
        img_array = tf.expand_dims(img_array, 0)  # Create batch axis
        # Note: a different form of the prediction call threw an error;
        # the call below worked instead.
        # Cite: https://www.py4u.net/discuss/246141
        predictions = model.predict(img_array)
        score = predictions[0]
        plt.title(picfile + ": %.2f percent Craft and %.2f percent NotCraft."
                  % (100 * (1 - score), 100 * score))
        plt.imshow(img)
        plt.axis("off")
        j += 1
    except Exception as e:
        print(e)
Addresses¶
My source of addresses is the National Address Database (NAD); however, only about 20 states participate. The particular addresses I will be scanning are in a CSV file in the working directory of this program. The list of addresses was prepared in another Python program of mine called Address List Builder.
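Because the NAD export has many columns and the search loop below indexes them by position, a small check (my addition; it assumes the file has a header row, which the next(reader) call below also assumes) prints each column name with its index so the positions used for street, city, state, and zip can be verified.
# Sketch: list the NAD csv columns with their indices to verify the positional
# indexing used below (row[20], row[15], row[16], row[6], row[1], row[7]).
with open("NAD_r7_NewYork_13905.csv", "r") as infile:
    for i, name in enumerate(next(csv.reader(infile))):
        print(i, name)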
Google Street View API Key¶
A key is needed in order to request Street View images. Screen showing the Google APIs I currently have permission to use: https://console.cloud.google.com/google/maps-apis/api-list?project=delaware-pizza
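Rather than hard-coding the key in the notebook, one option (my addition; the GSV_API_KEY variable name is just an assumption) is to read it from an environment variable.
# Sketch: read the Street View API key from an environment variable instead of
# pasting it into the notebook. GSV_API_KEY is an assumed variable name.
api_key = os.environ.get("GSV_API_KEY", "")
if not api_key:
    print("Warning: no API key found; Street View requests will be denied.")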
Evaluate Image¶
Define the function that evaluates a street view image using the CNN model. If the evaluation score exceeds the threshold for Arts and Crafts architecture, save it to disk and record the Google Maps location.
myloc = "FoundImages"  # Where to store the possible Arts and Crafts images

def EvaluateImage(stviewimg, addr):
    # stviewimg = the Google Street View image of a location
    # addr = the postal address of the location

    # Resize the image to the input size expected by the model.
    image = stviewimg.resize((180, 180), Image.ANTIALIAS)
    # Convert the image to the tensor array expected by the model
    img_array = keras.preprocessing.image.img_to_array(image)
    # Create batch axis
    img_array = tf.expand_dims(img_array, 0)
    # Evaluate the image via the model
    predictions = model.predict(img_array)
    # Extract the score value from the result
    score = predictions[0]
    # Save to disk the locations that appear to be Arts and Crafts
    if 1.0 - score > .50:  # "score" is the fraction that is "not craft", so we want a larger 1 - score value.
        # Save the picture if it could be a craft house.
        fname = addr + ".jpg"
        print(fname)
        # Display the evaluation result
        print("%.2f percent Craft and %.2f percent NotCraft." % (100 * (1 - score), 100 * score))
        # Save the original street view image (not the rescaled version)
        stviewimg.save(os.path.join(myloc, fname), format = "JPEG", optimize = True, quality = "maximum")
        # Cite: https://stackoverflow.com/questions/60933666/how-can-i-resize-image-with
        # -quality-without-saving-image-in-python
Retrieve Street View Image¶
Define the function that retrieves a Street View image for an address.
key = "&key=" + "<your api key>"  # will get denied with no API key
def GetStreetImage(Addr):
    # Addr = the address for Google Street View lookup
    base = "https://maps.googleapis.com/maps/api/streetview?size=1200x800&location="
    MyUrl = base + urllib.parse.quote_plus(Addr) + key  # added url encoding
    # print(MyUrl)
    response = requests.get(MyUrl)
    img = Image.open(BytesIO(response.content))
    # Evaluate the image using the model
    EvaluateImage(stviewimg = img, addr = Addr)
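One refinement to consider: when Street View has no panorama for an address, the Static API returns a gray placeholder image that would still be scored by the model. The sketch below (my addition) uses the Street View metadata endpoint to check availability before downloading the image.
# Sketch: check the Street View metadata endpoint and skip addresses with no
# panorama (status other than "OK") before requesting the actual image.
def HasStreetImage(Addr):
    meta_base = "https://maps.googleapis.com/maps/api/streetview/metadata?location="
    meta_url = meta_base + urllib.parse.quote_plus(Addr) + key
    return requests.get(meta_url).json().get("status") == "OK"
# Example: if HasStreetImage(address): GetStreetImage(Addr = address)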
# Create a process that reads a row from an address file, formats the address string and gets the
# static Street View picture of it.

# Load best model
best_model_filepath = "my_best_model.epoch23-loss0.20.h5"
model = keras.models.load_model(best_model_filepath)

j = 0  # a loop counter
maxj = 15000  # limit the number of addresses being searched
previousaddress = ""
with open("NAD_r7_NewYork_13905.csv", 'r') as infile:
    reader = csv.reader(infile)
    next(reader)
    for row in reader:
        # try:
        address = row[20] + " " + row[15] + " " + row[16] + ", " + row[6] + ", " + row[1] + " " + row[7]
        # Address = Street Name Pre-Directional + Address Number + Street name + City + State + Zip
        # writer.writerow(row)
        if previousaddress != address:  # For whatever reason, the same address can repeat multiple
                                        # times in the NAD.
            previousaddress = address
            if j % 100 == 0:
                print(j)
            # Get the image and then evaluate it
            GetStreetImage(Addr = address)
            j += 1
            if j >= maxj:
                break
        # except Exception as e:
        #     print(e)
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100
138 Front Street, Binghamton, NY 13905.jpg
65.43 percent Craft and 34.57 percent NotCraft.
140 Front Street, Binghamton, NY 13905.jpg
57.35 percent Craft and 42.65 percent NotCraft.
2200 2300
109 Oak Street, Binghamton, NY 13905.jpg
65.47 percent Craft and 34.53 percent NotCraft.
2400
6 Arthur Street, Binghamton, NY 13905.jpg
54.55 percent Craft and 45.45 percent NotCraft.
6 King Avenue, Binghamton, NY 13905.jpg
87.41 percent Craft and 12.59 percent NotCraft.
2500 2600 2700 2800 2900
57 Murray Street, Binghamton, NY 13905.jpg
66.92 percent Craft and 33.08 percent NotCraft.
3000 3100 3200 3300 3400 3500
48 Cleveland Avenue, Binghamton, NY 13905.jpg
81.87 percent Craft and 18.13 percent NotCraft.
3600 3700 3800 3900 4000 4100
10 Edgecomb Road, Binghamton, NY 13905.jpg
77.52 percent Craft and 22.48 percent NotCraft.
4200 4300 4400 4500 4600 4700
117 Seminary Avenue, Binghamton, NY 13905.jpg
60.18 percent Craft and 39.82 percent NotCraft.
4800 4900 5000 5100 5200 5300 5400 5500 5600 5700 5800 5900 6000 6100
151 Main Street, Binghamton, NY 13905.jpg
54.28 percent Craft and 45.72 percent NotCraft.
6200 6300 6400
62 North Street, Binghamton, NY 13905.jpg
51.45 percent Craft and 48.55 percent NotCraft.
46 North Street, Binghamton, NY 13905.jpg
59.64 percent Craft and 40.36 percent NotCraft.
46 North Street, Binghamton, NY 13905.jpg
59.64 percent Craft and 40.36 percent NotCraft.
6500
165 Clinton Street, Binghamton, NY 13905.jpg
80.97 percent Craft and 19.03 percent NotCraft.
167 Clinton Street, Binghamton, NY 13905.jpg
52.70 percent Craft and 47.30 percent NotCraft.
6600 6700 6800
27 Winding Way, Binghamton, NY 13905.jpg
59.70 percent Craft and 40.30 percent NotCraft.
35 Winding Way, Binghamton, NY 13905.jpg
66.55 percent Craft and 33.45 percent NotCraft.
6900
34 Lydia Street, Binghamton, NY 13905.jpg
77.57 percent Craft and 22.43 percent NotCraft.
7000 7100 7200 7300 7400 7500 7600 7700
25 Hazel Street, Binghamton, NY 13905.jpg
88.85 percent Craft and 11.15 percent NotCraft.
26 Hazel Street, Binghamton, NY 13905.jpg
54.89 percent Craft and 45.11 percent NotCraft.
7800 7900 8000 8100 8200 8300
400 Prospect Street, Binghamton, NY 13905.jpg
88.23 percent Craft and 11.77 percent NotCraft.
8400 8500 8600