November 2023 – The results of health inspections of food preparation locations in Delaware are available in a public database. This project displays an interactive map of those locations, with indicators of positive or negative inspection results. Scroll to the bottom of the report to see the map. The mapping function has not been optimized for speed, so it is a little slow to respond; give it about 15 seconds to load before examining it. This is a work in progress.
January 2022 – The Sears Kit Home Report project reflects my fascination with the idea of a complete house in the form of a big DIY kit. It examines the locations of these homes in Ohio, where many of the kit homes are located, and tries to come up with clues that Sears kit home hunters can use to decide where to look for them.
October 2021 – Part of the development of the Sears Kit Home Report involved learning how to programmatically download pictures of homes from Google Street View.
October 2021 – Part of what I wanted to do with the Sears Kit Home project was to develop a machine learning model that would detect the distinctive style of the homes. I would have the model crawl through Google Maps Street View and send me an alert when it found a possible kit home. This project explored how to download random house pictures from Google Open Images and use them as training material for the model.
October 2021 – An important step in finding Sears kit houses in Google Street View images is to isolate houses using bounding boxes. The House Detection project is meant to do that using a machine learning technique called YOLO (You Only Look Once).
August 2021 – The House Finder Model project explored using an Xception neural network to identify Arts and Crafts homes in images of houses. It showed promise, but needed many more training examples to be effective.
May 2021 – Pizza Hunt was an interesting attempt to use Delaware’s publicly available business license database and Google Reviews to first identify all the pizza joints in Delaware and then pick out the ones that were most highly rated by customers. The results were limited because Google charges for complete programmatic access to the reviews. As a result, I could only get the five most recent reviews for each establishment without paying.
June 2021 – I wanted to be able to monitor the performance of the investment options available to me in my company’s 403(b) program, so I wrote How to Download Stock Data Using Python. It uses the yfinance package to download stock, bond and ETF data from Yahoo Finance.
March 2021 – The next project explored the effectiveness of Portfolio Balancing in a retirement account. PB is a classic investment strategy where you try to maintain a constant ratio between stocks and bonds. With this strategy, when stock prices are down, you sell some bonds to buy stock. When stock prices are up, you sell some stock and buy bonds, all to maintain the desired ratio between the two. I took into account a further complication: once you reach a certain age, you must start taking minimum annual distributions from your traditional IRA, 401(a) or 403(b), and you have to pay taxes on the distributions.
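The rebalancing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the project: the function name `rebalance`, the 60/40 target and the dollar amounts are all assumptions made for the example, and it ignores the required distributions and taxes that the project accounts for.

```python
def rebalance(stock_value, bond_value, target_stock_ratio):
    """Return (new_stock_value, new_bond_value, stock_trade).

    stock_trade > 0 means buy stock (paid for by selling bonds);
    stock_trade < 0 means sell stock and buy bonds.
    """
    total = stock_value + bond_value
    target_stock = total * target_stock_ratio
    stock_trade = target_stock - stock_value
    return target_stock, total - target_stock, stock_trade


# Hypothetical example: a 60/40 portfolio after stocks fall 20%.
stocks = 60_000 * 0.8   # stock holdings are now worth 48,000
bonds = 40_000          # bond holdings are unchanged
new_stocks, new_bonds, trade = rebalance(stocks, bonds, 0.60)
print(f"stock trade: {trade:,.0f}; holdings now {new_stocks:,.0f} / {new_bonds:,.0f}")
```

Because stocks dropped, the trade is positive: some bonds are sold to buy stock until the 60/40 ratio is restored.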
December 2016 – The Smoking Map shows the percentage of the population in each US state that smokes tobacco on a daily basis.
April 2021 – The U.S. Census Bureau provides an extensive API for downloading census data. The next project introduces some of the basics for doing that.
November 2019 – Downloading census data is more complicated than you might think. A major challenge is first figuring out which of the MANY tables contain the data you are looking for. The next project provides essential tips and tricks for choosing the right data table and column in your queries.
March 2020 – The next project is a complete walkthrough of how to download around 25 different socio-economic metrics (median income, educational attainment, etc.) for a range of years, using the U.S. Census Bureau API.
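As a rough idea of what such a multi-year query looks like, the sketch below only builds the request URLs and does not call the API. It assumes the American Community Survey 5-year endpoint and the variable `B19013_001E` (median household income); the project itself may use other datasets, tables and years, and the helper name `build_acs_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint pattern: ACS 5-year estimates for a given year.
BASE = "https://api.census.gov/data/{year}/acs/acs5"


def build_acs_url(year, variables, geography="state:*", api_key=None):
    """Build a Census API query URL for one year (hypothetical helper)."""
    params = {"get": ",".join(["NAME", *variables]), "for": geography}
    if api_key:
        params["key"] = api_key
    # Keep commas, colons and asterisks readable instead of percent-encoding them.
    return BASE.format(year=year) + "?" + urlencode(params, safe=",:*")


# One URL per year for an example range, 2015 through 2019.
urls = [build_acs_url(year, ["B19013_001E"]) for year in range(2015, 2020)]
for url in urls:
    print(url)
```

Each URL could then be fetched with an HTTP client, and the JSON rows for each year combined into a single table.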
February 2024 – The company where I worked had strict rules prohibiting sending customer addresses over the Internet in order to geocode their locations. I figured out how to use the longitude/latitude of the addresses to pinpoint their census tract location. The solution maintained customer information privacy and was approximately 500 times faster than using the Census Bureau geocoding service.
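The write-up above does not say how the lookup was implemented, so the following is only a sketch of one common approach: a local point-in-polygon test (ray casting) against tract boundaries, with no network calls. The tract IDs and rectangular boundaries below are invented for illustration; real tract boundaries would have to be loaded from a boundary file, and a production version would likely use a spatial index rather than a linear scan.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside a polygon of (lon, lat) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def find_tract(lon, lat, tracts):
    """Return the ID of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


# Hypothetical tracts: two adjacent rectangles, not real census geometry.
TRACTS = {
    "tract_A": [(-75.6, 39.0), (-75.5, 39.0), (-75.5, 39.1), (-75.6, 39.1)],
    "tract_B": [(-75.5, 39.0), (-75.4, 39.0), (-75.4, 39.1), (-75.5, 39.1)],
}

print(find_tract(-75.55, 39.05, TRACTS))
print(find_tract(-75.45, 39.05, TRACTS))
```

Since everything runs locally against boundary data, no customer address or coordinate ever leaves the machine.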
May 2018 – There once was a popular analytical approach called text mining. I used it to prepare a deconstruction of all the Seinfeld episode scripts to answer earth-shaking questions like “what was the most frequent word uttered?” (Answer: “yeah.”)
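The word-frequency question can be illustrated with a short Python sketch. It is not the analysis from the project: the sample text is a made-up line rather than an actual script excerpt, and the tokenizer is a deliberately simple regular expression.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase words; apostrophes are kept so contractions stay whole."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Made-up sample dialogue, used only to exercise the function.
sample = "Yeah. Yeah, I know. Yeah, well, you know how it is."
counts = word_counts(sample)
print(counts.most_common(2))
```

Running the same function over the full text of every script would give the overall ranking of words.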
October 2018 – A frivolous use of Recurrent Neural Networks (RNNs) was a program I wrote to generate new and interesting names for drugs. A statistician friend who worked for a pharmaceutical company loved it, but said companies had a defined process for naming drugs according to their purpose and effect, so there was no chance it would ever be used.
September 2016 – Prior to the presidential election of 2016 I trained a model to analyze political speech to classify it as either Conservative or Liberal. I then applied it to the speeches of the two leading presidential candidates of the time to see how strong their leanings were.
August 2022 – The little town where I live has at its center an intersection of Broad Street and Main Street. I wondered how many other towns and cities also had both a Broad Street and a Main Street.
September 2022 – Once I had the list of towns that have both a Broad Street and a Main Street, I tried to find the ones where the two streets intersected near the geographic center of the town.