Broad and Main Cities
M. Fawcett 08/02/2022
Use the Overpass API to query the OpenStreetMap database for cities and towns that have both a "Main Street" and a "Broad Street".
Code snippets: https://janakiev.com/blog/openstreetmap-with-python-and-overpass-api/
City list: https://simplemaps.com/data/us-cities
In [ ]:
# Install the API package
# !pip install overpy
In [ ]:
# Install the package that converts longitude/latitude formats
# !pip install lat-lon-parser
In [1]:
import pandas as pd
import time
from lat_lon_parser import parse
import overpy
import csv
from csv import DictWriter
import os
import datetime
# Convert location example
lng = parse("45° 12.6' W")
print(lng)
-45.21
/Users/mitchellfawcett/anaconda3/lib/python3.7/site-packages/pandas/compat/_optional.py:138: UserWarning: Pandas requires version '2.7.0' or newer of 'numexpr' (version '2.6.9' currently installed). warnings.warn(msg, UserWarning)
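To make the conversion above less of a black box, here is a minimal standard-library sketch of the arithmetic behind it: degrees plus decimal minutes divided by 60, negated for the southern and western hemispheres. The helper name `dm_to_decimal` is hypothetical and is not part of `lat_lon_parser`; it only handles the one input shape shown here, whereas the real parser accepts many formats.

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and decimal minutes to signed decimal degrees.

    Hypothetical helper for illustration only; not part of lat_lon_parser.
    """
    value = degrees + minutes / 60.0
    # South and West are negative by convention
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return value

# 45 degrees 12.6 minutes West, the same location as the example above
lng_check = dm_to_decimal(45, 12.6, "W")
print(round(lng_check, 6))
```

The rounded result should agree with the -45.21 printed by `parse` above.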
In [2]:
print(datetime.datetime.now())
2022-08-28 19:57:33.652780
Cities & towns
Load a list of cities and towns with their latitude and longitude.
In [3]:
placesdf = pd.read_csv("uscities.csv")
placesdf = placesdf.reset_index() # make sure indexes pair with number of rows
placesdf.head()
Out[3]:
 | index | city | city_ascii | state_id | state_name | county_fips | county_name | lat | lng | population | density | source | military | incorporated | timezone | ranking | zips | id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | New York | New York | NY | New York | 36081 | Queens | 40.6943 | -73.9249 | 18680025 | 10768.0 | shape | False | True | America/New_York | 1 | 11229 11228 11226 11225 11224 11222 11221 1122... | 1840034016 |
1 | 1 | Los Angeles | Los Angeles | CA | California | 6037 | Los Angeles | 34.1141 | -118.4068 | 12531334 | 3267.0 | shape | False | True | America/Los_Angeles | 1 | 91367 90291 90293 90292 91316 91311 90035 9003... | 1840020491 |
2 | 2 | Chicago | Chicago | IL | Illinois | 17031 | Cook | 41.8375 | -87.6866 | 8586888 | 4576.0 | shape | False | True | America/Chicago | 1 | 60018 60649 60641 60640 60643 60642 60645 6064... | 1840000494 |
3 | 3 | Miami | Miami | FL | Florida | 12086 | Miami-Dade | 25.7840 | -80.2101 | 6076316 | 4945.0 | shape | False | True | America/New_York | 1 | 33128 33129 33125 33126 33127 33149 33144 3314... | 1840015149 |
4 | 4 | Dallas | Dallas | TX | Texas | 48113 | Dallas | 32.7935 | -96.7667 | 5910669 | 1522.0 | shape | False | True | America/Chicago | 1 | 75098 75287 75230 75231 75236 75237 75235 7525... | 1840019440 |
In [9]:
print(len(placesdf))
30409
In [4]:
# Define a function that checks whether a city has both a Broad Street and a Main Street
# within 1500 meters (~1 mile) of its geographic center
def find_streets(index_number, city_name, state_name, lat, lng, city_id):
    # Returns:
    #   "Found" if the location has both a Broad Street and a Main Street
    #   "NotFound" if the location does not have both
    #   "Exception" if the Overpass query throws an exception
    # Query for all named highways within 1500 meters of the coordinates
    s = """way(around:1500, """ + str(lat) + """, """ + str(lng) + """)[highway][name];out body;"""
    # Example of the same query for a single fixed location:
    # result = api.query("""
    #     way(around:1500, 39.4450, -75.7183)[highway][name];
    #     out body;
    #     """)
    api = overpy.Overpass()
    try:
        result = api.query(s)
        has_Broad = 0
        has_Main = 0
        for way in result.ways:
            if way.tags.get("name", "n/a").endswith('Broad Street'): has_Broad = 1
            if way.tags.get("name", "n/a").endswith('Main Street'): has_Main = 1
            # print("Name: %s" % way.tags.get("name", "n/a"))
            # print(" Highway: %s" % way.tags.get("highway", "n/a"))
            if has_Broad + has_Main == 2:
                # Both streets seen; no need to check the remaining ways
                print("FOUND BROAD & MAIN IN: " + city_name + ', ' + state_name)
                return "Found"
    except Exception as e:
        print(e)
        print(s)
        return "Exception"
    return "NotFound"
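The name-matching step inside `find_streets` can be tried out without contacting the Overpass server. The sketch below pulls that check into a hypothetical helper, `has_broad_and_main`, that takes a plain list of street names; the helper and the sample names are assumptions made for illustration. It uses the same `endswith` test as the function above, so "North Broad Street" counts as a match while "Broadway" or an abbreviated "Broad St" does not.

```python
def has_broad_and_main(street_names):
    """Return True if the names include both a Broad Street and a Main Street.

    Hypothetical helper; mirrors the endswith checks used in find_streets.
    """
    has_broad = False
    has_main = False
    for name in street_names:
        if name.endswith("Broad Street"):
            has_broad = True
        if name.endswith("Main Street"):
            has_main = True
        if has_broad and has_main:
            return True  # stop as soon as both have been seen
    return False

# Made-up street names for illustration
sample_names = ["North Broad Street", "East Main Street", "Elm Avenue"]
print(has_broad_and_main(sample_names))
print(has_broad_and_main(["Broadway", "Main Street"]))
```

Keeping the matching logic separate from the network call would also make it easier to experiment with looser rules, such as accepting "Broad St".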
In [5]:
### Initialize a CSV file with column headings if it does not already exist
# List of column names
field_names = ['Index', 'City', 'State', 'CityID',
               'Latitude', 'Longitude']
csv_name = 'BroadMainCities.csv'
if not os.path.isfile(csv_name):
    # Start a CSV file and write the column headers.
    # The with block closes the file automatically, so no explicit close() is needed.
    with open(csv_name, 'w', newline='') as file_object:
        dw = csv.DictWriter(file_object, delimiter=',',
                            fieldnames=field_names)
        dw.writeheader()
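The pattern used here, writing the header once and then appending one row per hit, can be exercised on its own. The following is a minimal sketch under stated assumptions: the helper `append_row`, the name `demo_fields`, and the temporary file are hypothetical, and the two sample rows are copied from the results table at the end of this notebook. It writes to a temporary directory so the real `BroadMainCities.csv` is not touched.

```python
import csv
import os
import tempfile

# Same column names as the notebook's field_names (renamed to avoid clobbering it)
demo_fields = ['Index', 'City', 'State', 'CityID', 'Latitude', 'Longitude']

def append_row(csv_path, row):
    """Append one row, writing the header first if the file does not exist yet."""
    is_new = not os.path.isfile(csv_path)
    with open(csv_path, 'a', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=demo_fields)
        if is_new:
            writer.writeheader()
        writer.writerow(row)

# Work in a temporary directory so no real output file is modified
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
append_row(demo_path, {'Index': 42, 'City': 'Providence', 'State': 'Rhode Island',
                       'CityID': '1840003289', 'Latitude': 41.8230, 'Longitude': -71.4187})
append_row(demo_path, {'Index': 54, 'City': 'Hartford', 'State': 'Connecticut',
                       'CityID': '1840004773', 'Latitude': 41.7661, 'Longitude': -72.6834})

# Read the file back to confirm there is one header and two data rows
with open(demo_path, newline='') as f:
    rows = list(csv.DictReader(f))
print(len(rows), [r['City'] for r in rows])
```

Because each hit is flushed to disk as soon as it is found, an interrupted run loses nothing that was already written.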
In [6]:
### Loop through the places found in the cities CSV file
r = 0 # counter of locations processed in this run
idx_start = 30408 # first row to process in this run
for index, row in placesdf[idx_start:].iterrows():
    index_number = index
    city_name = row['city']
    state_name = row['state_name']
    lat = row['lat']
    lng = row['lng']
    city_id = str(row['id'])
    # Call the function that queries Overpass with the coordinates of a location.
    # Retry the location if the Overpass server is overloaded.
    retry_counter = 0 # number of times this location has been queried
    retry_max = 3 # maximum number of tries per location
    while retry_counter < retry_max:
        print(index, datetime.datetime.now())
        found = find_streets(index_number, city_name, state_name, lat, lng, city_id)
        # 'found' will be either "NotFound", "Found" or "Exception"
        if found == 'Found': # Location has both a Broad Street and a Main Street
            print(index_number, city_name, state_name, lat, lng, city_id, 'Found = ', found)
            # Collect the city's details in a dictionary keyed by column name
            found_dict = {'Index': index_number, 'City': city_name, 'State': state_name, 'CityID': city_id,
                          'Latitude': lat, 'Longitude': lng}
            # Append the city to the CSV file.
            # The with block closes the file automatically.
            with open(csv_name, 'a', newline='') as file_object:
                dictwriter_object = DictWriter(file_object, fieldnames=field_names)
                dictwriter_object.writerow(found_dict)
            break
        if found == 'NotFound':
            break # Done with this location. Go on to the next location
        if found == 'Exception':
            # Wait a bit so the server doesn't overload, then try this location again
            time.sleep(10) # 10 seconds
            retry_counter += 1
    # Wait a bit before trying the next location
    time.sleep(1) # 1 second
    r += 1
    if r > 5000:
        print("r hit limit")
        break
30408 2022-08-28 19:57:37.971709
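The retry logic in the loop above can be isolated and tested without any network access. This is a minimal sketch, not the notebook's actual code: `query_with_retries`, `flaky_query`, and the `sleep_fn` parameter are hypothetical names, and `flaky_query` is a stand-in for `find_streets` that fails twice before succeeding. As in the loop above, a location that still returns "Exception" after the last try is simply given up on.

```python
import time

def query_with_retries(query_fn, max_tries=3, wait_seconds=10, sleep_fn=time.sleep):
    """Call query_fn until it returns something other than "Exception".

    Gives up and returns "Exception" after max_tries attempts.
    """
    for _ in range(max_tries):
        status = query_fn()
        if status != "Exception":
            return status
        # Pause before the next attempt so the server is not hammered
        sleep_fn(wait_seconds)
    return "Exception"

# Stand-in for find_streets: fails on the first two calls, succeeds on the third
calls = {"count": 0}

def flaky_query():
    calls["count"] += 1
    return "Exception" if calls["count"] < 3 else "Found"

# wait_seconds=0 keeps the demonstration fast
final_status = query_with_retries(flaky_query, wait_seconds=0)
print(final_status, calls["count"])
```

Note that locations abandoned after the final retry are not recorded anywhere, so logging them to a separate file would be one way to revisit them later.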
In [ ]:
print(r)
In [ ]:
print(index)
In [10]:
broadandmain_df = pd.read_csv("BroadMainCities.csv")
In [11]:
broadandmain_df
Out[11]:
 | Index | City | State | CityID | Latitude | Longitude |
---|---|---|---|---|---|---|
0 | 42 | Providence | Rhode Island | 1840003289 | 41.8230 | -71.4187 |
1 | 51 | Bridgeport | Connecticut | 1840004836 | 41.1918 | -73.1954 |
2 | 54 | Hartford | Connecticut | 1840004773 | 41.7661 | -72.6834 |
3 | 65 | Rochester | New York | 1840000373 | 43.1680 | -77.6162 |
4 | 105 | Winston-Salem | North Carolina | 1840015324 | 36.1029 | -80.2610 |
... | ... | ... | ... | ... | ... | ... |
513 | 27836 | Lake City | Kansas | 1840030111 | 37.3569 | -98.8279 |
514 | 27869 | McFarlan | North Carolina | 1840016448 | 34.8148 | -79.9766 |
515 | 27906 | Sellers | South Carolina | 1840017962 | 34.2826 | -79.4724 |
516 | 28564 | Luray | Missouri | 1840012115 | 40.4524 | -91.8841 |
517 | 28601 | Swink | Oklahoma | 1840026983 | 34.0169 | -95.2018 |
518 rows × 6 columns