from IPython.display import HTML # for hiding code cells
HTML('''<script>
code_show=true;
function code_toggle() {
if (code_show){
$('div.input').hide();
} else {
$('div.input').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
The computer code for this report is by default hidden for easier reading.
To toggle the code on and off, click <a href="javascript:code_toggle()">here</a>.''')
M. Fawcett - December 23, 2021
rev. 01/21/2022
Between 1908 and 1942, the Sears Roebuck company sold houses in the form of build-it-yourself kits. Huge kits were prepared at factories, mostly in Illinois, and shipped in railroad freight cars to customers all over the country. Each kit contained all the materials needed to build a house. Customers lugged their 25 tons of numbered precut lumber, shingles, wall board, flooring and so on, from the freight car to their building site, and got to work following the instructions in the 75-page construction guide. Happy buyers considered the homes attractive, well designed and economical, about 30 percent less costly than a similar, existing home.
For me personally, this business model seems crazy and impractical, but Sears wasn't the only one selling kit homes. Sears was the most successful, however, selling between 75,000 and 100,000 of them. Because Sears was a mail-order marketer of all manner of merchandise, it also sold the tools used to build the houses, and then the appliances, furniture and fixtures that filled them when they were completed. More than one observer has called Sears the Amazon of its time.
But success got cut short when the Great Depression and World War II took away much of the demand for new housing. After World War II a new trend in housing, tract housing, took over and the Sears kit home business faded into oblivion. The Sears company itself has lately been fading into oblivion. Having declared bankruptcy in 2018, Sears barely exists at all now except in legal proceedings while its few remaining stores are gradually liquidated.
With no official list of where the kit homes were built, a few fascinated enthusiasts now hunt for them and share their discoveries through social media and websites. They find the story of these homes so appealing that they don't want to see them lost to history.
This report describes where kit homes have been found and offers clues as to where others are likely to be found. It looks at things like street name, distance from railroads, local economic factors and population characteristics.
This report was written in the Python computer language. (There is a link at the top of the report for hiding or showing the computer code.) Another software program called QGIS was used to prepare some of the data displayed in the maps. The US Census Bureau provided neighborhood social and economic data.
This report confirms some fan beliefs about where to find Sears kit homes. Based on the analysis of kit homes in Ohio, I found,
Education level and population age do not seem to be associated with areas having greater numbers of kit homes.
Below there are a couple of interactive maps that let you zoom into a view of the US to see where kit homes are located. With a few extra mouse clicks you can see a picture of the home in Google Street View.
My original goal when starting this project was to create a computer program that could analyze a picture of a house and determine if it was a Sears kit home. This turned out to be too hard a problem due to the large number of different-looking models (over 350) produced over the years. Another goal was to have the computer program methodically "crawl" through Google Street View images and pick out houses that had a high probability of being a Sears kit home. This turned out to be impractical because Google charges 7/10ths of a cent every time a computer program uses Street View to capture an image. The bill for scanning 10,000 images was \$70.00. Scanning 1 million images would have cost \$7,000.00.
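The Street View arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and the constant are made up for illustration, and the price is simply the 7/10ths of a cent per image quoted in the text, expressed in tenths of a cent so the calculation stays in whole numbers until the final division.

```python
# Price quoted in the text: 7/10ths of a cent per image,
# stored here in tenths of a cent (hypothetical constant name).
PRICE_TENTHS_OF_CENT_PER_IMAGE = 7


def street_view_cost(n_images, tenths_of_cent=PRICE_TENTHS_OF_CENT_PER_IMAGE):
    """Return the cost in dollars of capturing n_images Street View images."""
    # 1,000 tenths of a cent make one dollar
    return n_images * tenths_of_cent / 1000


for n in (10_000, 1_000_000):
    print("{:,} images cost {:,.2f} dollars".format(n, street_view_cost(n)))
```

The two figures in the paragraph above correspond to `street_view_cost(10_000)` and `street_view_cost(1_000_000)`.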
%%capture --no-display
# Previous line suppresses a warning about package versioning
# Load Python modules needed for the analysis
import pandas as pd # for dataframe manipulation
import numpy as np # for numerical analysis
import matplotlib.pyplot as plt # for generating plots and graphs
from matplotlib.pyplot import figure # for modifying appearance of plots & graphs
import requests # to make HTTP requests to the US Census geocoder and the Census web API
import io # for working with I/O streams and converting the geocode response to a dataframe
import csv # for reading/writing csv and text files
import pickle as pk # to store and retrieve dataframes on disk
import os # to list contents of disk drive folders
import sys # for managing system options
import folium # the mapping package
from folium import plugins # to allow cluster markers on maps
import seaborn as sns # for fancy plotting
from IPython.display import Markdown as md # for embedding variables in markdown cells
from IPython.display import Image # to embed jpg images in notebook
# See http://blog.nextgenetics.net/?e=102 for hiding computer code in this report.
# For internal links table of contents see...
# https://nbviewer.org/github/rasbt/python_reference/blob/master/tutorials/table_of_contents_ipython.ipynb#top
# Settings to improve the display of tabular results
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
np.set_printoptions(threshold = sys.maxsize)
A spreadsheet (maintained by LS) of around 13,000 kit home locations is the basis of this analysis. The list of locations is as of October 27, 2021.
The first step was to load the list of kit home addresses from the spreadsheet into computer memory and do some minor tidying of the data. Overall the spreadsheet data is pretty clean. The values in each column were entered in a consistent way so there was not a need to do much cleanup to make good use of the data.
The main cleanup was in a column called "Auth?". I changed the name to "Auth", replaced missing values with "N/A" and converted all the values in the column to uppercase, so every entry is "YES", "NO" or "N/A". The Auth column indicates whether a location has gone through a confirmation step showing there is evidence that it is truly a Sears kit home.
Below is a sample of the data from the spreadsheet after the cleanup.
None of the cleanup affected the original spreadsheet. All of this work is done on an in-memory copy of the spreadsheet data.
# Read the Excel file of kit home locations into a Pandas dataframe.
address_df = pd.read_excel(r"Sears Roebuck Houses.xlsx", sheet_name = "Locations")
# Add a row number to each address. A unique number for each row will be needed by the
# US Census Bureau geocoder
address_df.insert(loc=0, column='row_num', value=np.arange(len(address_df)) + 2)
# The +2 accounts for the header row and for numbering starting at 0,
# so the row_num value is aligned with the row number in the original Excel file.
# Remove the "?" from the Auth? column name.
address_df.rename(columns={"Auth?": "Auth"}, inplace = True)
# Tidy up the values in the "Auth" column.
# Replace missing values (NaN) with "N/A".
address_df["Auth"] = address_df["Auth"].fillna('N/A')
# Make all the values in the Auth column uppercase
address_df["Auth"] = address_df["Auth"].str.upper()
# Examine some of the cleaned up data
address_df.head()
# Total number of locations
tot = len(pd.unique(address_df["Address"]))
# Number authenticated
yes_n = len(address_df[address_df["Auth"] == 'YES'])
md("The total number of locations contained in the spreadsheet is {:,}. The number that have been authenticated is {:,}.".format(tot, yes_n))
As the chart below shows, Ohio has the most kit homes followed by Illinois, Pennsylvania and New York. Every state appears to have at least one kit home.
These counts include authenticated plus unauthenticated locations.
state_count = address_df['State'].value_counts()
# Plot a barchart
figure(figsize=(16, 6))
state_count.plot.bar()
plt.title("Number of Locations by State")
plt.show()
num_models = len(address_df['Model'].value_counts())
md("There are {} models mentioned in the spreadsheet. Some of these are variations " \
"of the same model name, for example Concord, Concord/No. 114, Concord/No. 3379.".format(num_models))
# List all the model names in alphabetical order (missing model names are skipped).
models_df = pd.DataFrame(sorted(address_df["Model"].dropna().astype(str).unique()))
Below is a list of the 50 most frequently mentioned model names in the spreadsheet and how many times each was mentioned.
model_count = address_df['Model'].value_counts().nlargest(50)
# Plot a barchart
figure(figsize=(10, 16))
model_count.plot.barh()
plt.title("Number of Kit Homes by Model Name (Top 50)")
plt.show()
Below are pictures of the three models that appear most frequently in the spreadsheet data.
[Figure: pictures of the three most frequently listed models]
Mapping the kit home locations requires having their longitude and latitude. The US Census Bureau provides a service called "Geocoding" for translating a mailing address into a pair of long/lat coordinates. It took around 20 minutes to process the list of 13,000 addresses using the Census Bureau service. There was no charge for this. Not every attempt at geocoding will result in an exact match, but in most cases here, it did.
Below is a sample of the spreadsheet data enhanced with the additional information provided by the geocoding. Scrolling the sample horizontally reveals the additional columns for longitude, latitude, state code, county code, census tract number and ZIP Code. More about "census tracts" and "GEOIDs" later.
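The geocoding step itself is not shown in this report, since the results are read back from a file. The sketch below is one way it could be done and is not necessarily how the stored results were produced. The endpoint URL, the `benchmark` and `vintage` values and the 10,000-address limit per file are my understanding of the Census Bureau batch geocoder and should be checked against its current documentation. The function names and some of the column names (ADDRESS_IN, MATCH_TYPE, TIGER_EDGE, STREET_SIDE, CENSUS_BLOCK) are made up for illustration; the other column names are the ones used in this report.

```python
import csv
import io

# Assumed endpoint for the Census Bureau batch geocoder (verify before use)
GEOCODER_URL = "https://geocoding.geo.census.gov/geocoder/geographies/addressbatch"
BATCH_LIMIT = 10_000  # assumed maximum number of addresses per uploaded file

# The response has no header row, so column names are supplied here.
RESULT_COLUMNS = ["ID", "ADDRESS_IN", "MATCH_INDICATOR", "MATCH_TYPE",
                  "ADDRESS_OUT", "LONG_LAT", "TIGER_EDGE", "STREET_SIDE",
                  "FIPS_STATE", "FIPS_COUNTY", "CENSUS_TRACT", "CENSUS_BLOCK"]


def build_batch_csv(rows):
    """Turn (row_num, street, city, state, zip) tuples into header-less CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def parse_batch_response(text):
    """Turn the geocoder's CSV response into a list of dicts keyed by RESULT_COLUMNS."""
    results = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        # Unmatched addresses come back with fewer fields; pad them with blanks
        padded = row + [""] * (len(RESULT_COLUMNS) - len(row))
        results.append(dict(zip(RESULT_COLUMNS, padded)))
    return results


def geocode_batch(rows, benchmark="Public_AR_Current", vintage="Current_Current"):
    """Upload one batch of at most BATCH_LIMIT addresses and return the parsed results."""
    import requests  # third-party package; imported here so the helpers above run without it
    response = requests.post(
        GEOCODER_URL,
        files={"addressFile": ("addresses.csv", build_batch_csv(rows), "text/csv")},
        data={"benchmark": benchmark, "vintage": vintage},
        timeout=600,
    )
    response.raise_for_status()
    return parse_batch_response(response.text)
```

Under the assumed 10,000-address limit, a list of around 13,000 addresses would have to be sent as at least two batches. The list of dicts returned by `geocode_batch` can be passed to `pd.DataFrame` to get a table like the one sampled below.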
# Retrieve the coordinates and other results of the geocoding that were previously stored in a computer file.
geocoded_results_df = pd.read_pickle('geocoded_results.pkl')
# Only keep rows that were successfully geocoded
geocoded_results_df = geocoded_results_df[geocoded_results_df["MATCH_INDICATOR"] == "Match"]
# Convert geography code values from numeric to string
geocoded_results_df['FIPS_STATE'] = geocoded_results_df['FIPS_STATE'].astype(int).astype(str)
geocoded_results_df['FIPS_COUNTY'] = geocoded_results_df['FIPS_COUNTY'].astype(int).astype(str)
geocoded_results_df['CENSUS_TRACT'] = geocoded_results_df['CENSUS_TRACT'].astype(int).astype(str)
# Left pad geography values with zeros
geocoded_results_df['FIPS_STATE'] = geocoded_results_df['FIPS_STATE'].apply('{:0>2}'.format)
geocoded_results_df['FIPS_COUNTY'] = geocoded_results_df['FIPS_COUNTY'].apply('{:0>3}'.format)
geocoded_results_df['CENSUS_TRACT'] = geocoded_results_df['CENSUS_TRACT'].apply('{:0>6}'.format)
# Create a unique geographic identifier by combining state, county and census tract code for each row.
geocoded_results_df["GeoID"] = geocoded_results_df["FIPS_STATE"] \
+ geocoded_results_df["FIPS_COUNTY"] \
+ geocoded_results_df["CENSUS_TRACT"]
# Split the LONG_LAT column into separate Longitude and Latitude columns.
# n is passed by keyword because recent pandas versions no longer accept it positionally.
geocoded_results_df[['Longitude', 'Latitude']] = geocoded_results_df['LONG_LAT'].str.rsplit(',', n=1, expand=True)
# Merge the Sears Kit Home style for each location from the original address list with the geocoded results.
mapping_data_df = pd.merge(left = address_df[['row_num','Model','Address','City','State','Auth']],
right = geocoded_results_df,
how = 'right',
left_on = 'row_num',
right_on = 'ID')
# Examine some of the results
mapping_data_df.head()
# Build a list containing all the coordinates so they can be plotted on the map
locations = mapping_data_df[['Latitude', 'Longitude']]
locationlist = locations.values.tolist()
The map below shows the CONFIRMED (authenticated) kit homes using DARK BLUE markers and the UNCONFIRMED (not authenticated) kit homes using LIGHT BLUE markers.
The icon that looks like a stack of square pancakes in the upper right corner of the map is the "layer control". You can use it to hide or show the confirmed and unconfirmed markers.
The + and - icons in the upper left of the map let you zoom in and out. If your mouse has a wheel, you can use it to zoom. If you have a track pad you may be able to swipe it to zoom as well.
The number in each blue rectangle is the number of houses in that group. Clicking on a numbered marker zooms in and separates the big group into smaller groups. The groups themselves don't mean anything. They are just there to keep the map from being cluttered with thousands of markers. There is no way to turn off grouping to see all the individual markers at once.
Once you zoom in far enough you will see individual markers that tag a single location. These are the markers with a little "i" in the center. If you click on one of these you will see the address, the model name and whether it has been confirmed.
Something that is kind of fun to do is to highlight the address (just the address) in the pop-up tag, right-click the highlight, and select "Search with Google" (it might say something different depending on your browser). In most cases it will bring up a Google Street View page for the house. There is no charge for this sort of use of Street View.
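The highlight-and-search step could also be replaced by a clickable link inside the pop-up tag. The sketch below is not part of this report's map code. The helper name `google_search_link` is made up, the link uses the Google Maps search URL form (`https://www.google.com/maps/search/?api=1&query=...`), and it assumes folium renders the pop-up text as HTML.

```python
import html
from urllib.parse import urlencode


def google_search_link(address):
    """Return an HTML link that opens a Google Maps search for the address."""
    # urlencode takes care of spaces, commas and other special characters
    url = "https://www.google.com/maps/search/?" + urlencode({"api": 1, "query": address})
    # Escape both the URL and the visible text so the link is valid HTML
    return '<a href="{}" target="_blank">{}</a>'.format(html.escape(url), html.escape(address))
```

The returned string could be used in place of the plain address when building a marker's `popup` text. From the Google Maps page for the address, Street View is one more click away.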
### Define functions to set the color of cluster markers. Confirmed and unconfirmed locations have
### different colors. This gets used by all maps.
# This sets the color for CONFIRMED locations clusters.
icon_create_function_confirmed = """
function(cluster) {
var childCount = cluster.getChildCount();
/*
// comment: can have something like the following to modify the different cluster sizes....
var c = ' marker-cluster-';
if (childCount < 50) {
c += 'large';
} else if (childCount < 300) {
c += 'medium';
} else {
c += 'small';
}
// The marker-cluster-<'size'> gets passed in the "return new L.DivIcon()" function below.
*/
return new L.DivIcon({ html: '<div><span style="background-color:darkblue;color:white;font-size: 20px;">' + childCount + '</span></div>', className: 'marker-cluster', iconSize: new L.Point(40, 30) });
}
"""
# This sets the color of UNCONFIRMED location clusters.
icon_create_function_unconfirmed = """
function(cluster) {
var childCount = cluster.getChildCount();
return new L.DivIcon({ html: '<div><span style="background-color:lightblue;color:black;font-size: 20px;">' + childCount + '</span></div>', className: 'marker-cluster', iconSize: new L.Point(40, 30) });
}
"""
# Create a map using the Map() function and the coordinates of the locations of all the homes.
# Map starts out centered on Ohio.
mp = folium.Map(location=[40.367474, -82.996216], zoom_start=7, width=900, height=550, control_scale=True)
# Ohio_map
# Feature groups allow customization of layer control labels so they don't have to say "macro blah..."
fg_confirmed = folium.FeatureGroup(name = 'Confirmed Locations', show = True)
mp.add_child(fg_confirmed)
fg_unconfirmed = folium.FeatureGroup(name = 'Unconfirmed Locations', show = True)
mp.add_child(fg_unconfirmed)
# Add the Marker clusters for confirmed and unconfirmed locations to feature group
marker_cluster_confirmed = plugins.MarkerCluster(icon_create_function = icon_create_function_confirmed).add_to(fg_confirmed)
marker_cluster_unconfirmed = plugins.MarkerCluster(icon_create_function=icon_create_function_unconfirmed).add_to(fg_unconfirmed)
# A function to choose a marker color depending on if the house is a confirmed kit house or not.
# The individual location markers use the same color as their cluster markers.
def getcolor(auth_val):
    if auth_val == 'YES':
        return ("darkblue", "Confirmed")
    return ("lightblue", "Unconfirmed")
### Add the Confirmed and Unconfirmed kit homes to their own map layers.
# Each status has its own marker cluster, so one pass over the locations fills both layers.
clusters = {"Confirmed": marker_cluster_confirmed, "Unconfirmed": marker_cluster_unconfirmed}
# Loop through all the location pairs.
for point, location in enumerate(locationlist):
    try:
        clr, status = getcolor(mapping_data_df["Auth"][point])
        folium.Marker(
            location = location,
            popup = status + " " + mapping_data_df['Model'][point] + ": " + mapping_data_df['ADDRESS_OUT'][point],
            icon = folium.Icon(color = clr)
        ).add_to(clusters[status])
    except Exception: # skip rows with missing coordinates or other missing values
        pass
# add layer control to map (allows layer to be turned on or off)
folium.LayerControl().add_to(mp)
# Display the map
mp