01 Philly’s Destinations
What are the major destinations in Philadelphia? When, how, and why do people travel? Replica’s data can help us answer these questions. Replica provides modeled trip-level data for a typical weekday, built from multiple sources:
- Census and ACS;
- Travel surveys;
- In-auto GPS data;
- Data from transit agencies.
Replica describes its data sources and methodology in more detail here. We keep only trips that both start and end in Philadelphia. The modeled data for Philadelphia is generally regarded as trustworthy.
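The subsetting step can be sketched as follows. This is a minimal illustration rather than the project's actual script: the column name `origin_GEOID`, the helper `subset_philadelphia_trips`, and the toy rows are assumptions made for the example. It relies on the fact that block-group GEOIDs begin with the state and county FIPS codes, which are 42 and 101 for Philadelphia County.

```python
import pandas as pd

# State 42 (Pennsylvania) + county 101 (Philadelphia)
PHILA_FIPS_PREFIX = "42101"

def subset_philadelphia_trips(trips: pd.DataFrame) -> pd.DataFrame:
    """Keep trips whose origin and destination block groups are both in Philadelphia."""
    starts_in = trips["origin_GEOID"].astype(str).str.startswith(PHILA_FIPS_PREFIX)
    ends_in = trips["destination_GEOID"].astype(str).str.startswith(PHILA_FIPS_PREFIX)
    return trips[starts_in & ends_in].reset_index(drop=True)

# Toy rows, made up for illustration only (hypothetical column layout)
toy_trips = pd.DataFrame({
    "origin_GEOID": ["421010001001", "420454001001", "421010002002"],
    "destination_GEOID": ["421010005001", "421010005001", "340070001001"],
})
phila_trips = subset_philadelphia_trips(toy_trips)
```

Only the first toy trip survives, because the other two either start or end outside the county.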
Data processing
For every trip, Replica’s data includes the trip origin and destination (at the block-group level), the trip distance and duration, and demographic data about the trip-taker. We selected only the columns we needed for this analysis. The data processing script can be found here.
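As a rough sketch of the kind of aggregation the later plots rely on, the snippet below counts arrivals per destination block group and hour. The output columns `destination_GEOID`, `trip_end_time`, and `count` match the names used in the code excerpt at the end of this post; the input column `end_time` (assumed to be an `HH:MM:SS` string), the helper name, and the toy rows are assumptions.

```python
import pandas as pd

def count_arrivals_by_hour(trips: pd.DataFrame) -> pd.DataFrame:
    """Count arrivals per destination block group and hour of day (0-23)."""
    # Assumption: `end_time` holds clock times as "HH:MM:SS" strings
    hours = pd.to_datetime(trips["end_time"], format="%H:%M:%S").dt.hour
    return (
        trips.assign(trip_end_time=hours)
        .groupby(["destination_GEOID", "trip_end_time"])
        .size()
        .reset_index(name="count")
    )

# Toy rows, made up for illustration only
toy_trips = pd.DataFrame({
    "destination_GEOID": ["421010005001", "421010005001", "421010005001", "421010002002"],
    "end_time": ["08:15:00", "08:45:00", "17:05:00", "08:30:00"],
})
arrivals_by_hour = count_arrivals_by_hour(toy_trips)
```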
Philadelphia’s destinations
Using datashader with the Python hvplot library, we plotted every destination in Philadelphia at different times of the day. There are two peak periods, one in the morning and one in the afternoon. Destinations are highly concentrated in Center City in the morning, but are more diffuse in the afternoon.
Note: each dot represents an arrival. The dot is randomly generated within the block group in which the arrival falls.
Destination typology
It makes sense that the overall number of arrivals has two peaks per day, as shown in the graph below, but what about each individual block group?
To find out, we ran a clustering study using the k-means method. For each block group, we calculated the number of arrivals in each hour and normalized these hourly counts by that block group's count for 12-1pm. Using a scree plot, we chose $k=8$ as the number of clusters.
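The normalization and the scree-plot step can be sketched as below. This is only an illustration under stated assumptions: it uses a plain NumPy implementation of Lloyd's algorithm to stay dependency-light (the original analysis may well have used a library implementation), the helper names `normalize_by_noon` and `kmeans_inertia` are hypothetical, and the synthetic data has just two made-up arrival patterns, so it will not reproduce the $k=8$ choice.

```python
import numpy as np

def normalize_by_noon(hourly_counts: np.ndarray) -> np.ndarray:
    """Divide each block group's 24 hourly counts by its 12:00-13:00 count."""
    # Assumption: rows with zero noon arrivals were removed beforehand
    noon = hourly_counts[:, 12].astype(float)
    return hourly_counts / noon[:, None]

def kmeans_inertia(X: np.ndarray, k: int, n_iter: int = 50, seed: int = 0):
    """Plain Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every row to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its members (keep it if it has none)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = d2[np.arange(len(X)), labels].sum()
    return labels, float(inertia)

# Synthetic block groups: 10 with a morning peak, 10 with an afternoon peak
rng = np.random.default_rng(42)
hours = np.arange(24)
morning = 100 * np.exp(-0.5 * ((hours - 8) / 1.5) ** 2) + 20
evening = 100 * np.exp(-0.5 * ((hours - 17) / 1.5) ** 2) + 20
counts = np.vstack([
    morning + rng.uniform(0, 5, size=(10, 24)),
    evening + rng.uniform(0, 5, size=(10, 24)),
])

X = normalize_by_noon(counts)
# Values for a scree plot: inertia against k; look for the "elbow"
inertias = {k: kmeans_inertia(X, k)[1] for k in range(1, 6)}
```

Plotting `inertias` against `k` gives the scree plot; the number of clusters is read off where the curve stops dropping sharply.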
The interactive chart shows the “arrival pattern” of each block group. Click to select block group(s) in the bar plot on the left and observe their arrival patterns in the plot on the right. Some clusters have two clear peaks, whereas others have one dominant peak. Clusters with a morning peak suggest more business-oriented land uses, and those with an afternoon peak suggest residential or some commercial land uses.
The map below shows the different types of destinations.
Destinations of different trip purposes
The four charts below show the destinations of trips with different purposes. Each dot represents an arrival and is randomly placed within the block group in which the arrival falls. Work destinations are the most spatially concentrated, more so than eating and shopping destinations. Shopping and school destinations are more spatially diffuse.
To find out more about the specific origin-destination (OD) pairs of these trips, read the next post.
Code excerpt
Code excerpt to generate random points in a polygon. Adapted from Nick Hand’s course materials.
# Function to generate random points inside a polygon
import numpy as np
from shapely.geometry import Point

def generate_rand_pts_in_polygon(number, polygon):
    """
    Takes a shapely polygon and the number of points to generate within it.
    Returns a list of shapely points randomly generated within this polygon.
    """
    points = []
    try:
        # Get the bounds of the polygon
        min_x, min_y, max_x, max_y = polygon.bounds
        # Randomly generate points within the bounds (rejection sampling):
        # keep a point if it falls inside the polygon, discard it otherwise
        while len(points) < number:
            this_point = Point(np.random.uniform(min_x, max_x), np.random.uniform(min_y, max_y))
            if polygon.contains(this_point):
                points.append(this_point)
        return points
    except Exception:
        # Any failure (for example, a missing geometry) yields no points
        return []
# Function to make a DataFrame of the x, y coords of each generated trip,
# randomly distributed within the corresponding block group.
# Note: phila_block_groups_3857 is defined elsewhere in the project code.
import geopandas as gpd
import pandas as pd

def shade_destinations(df):
    # Join the block-group geometry, used below to generate random points
    df = phila_block_groups_3857[['GEOID', 'geometry']].merge(
        df,
        how='right',
        left_on='GEOID',
        right_on='destination_GEOID'
    )
    # One list of random points per row, sized by that row's arrival count
    pts = df.apply(
        lambda row: generate_rand_pts_in_polygon(
            row['count'], row['geometry']
        ),
        axis=1,
    )
    # Flatten the lists of points into a single GeoSeries
    pts = gpd.GeoSeries(pts.apply(pd.Series).stack(), dtype=object, crs='EPSG:3857')
    pts.name = 'geometry'
    pts = gpd.GeoDataFrame(pts)
    # Calculate the X and Y coordinates
    pts['x'] = pts.geometry.x
    pts['y'] = pts.geometry.y
    pts = (pts.reset_index())[['x', 'y', 'geometry']]
    return pts
We use the imageio package to create a GIF from a series of images.
# Note: min_x, max_x, min_y, max_y, x_span and y_span (the map extent)
# are defined elsewhere in the project code.
import datashader as ds
import imageio
import matplotlib.pyplot as plt
from datashader.mpl_ext import dsshow

# For each hour, plot each destination, randomly distributed within the block group it belongs to
for this_hour in range(24):
    this_hour_df = arrivals_by_hour.loc[arrivals_by_hour['trip_end_time'] == this_hour]
    this_pts = shade_destinations(this_hour_df)
    fig, ax = plt.subplots(figsize=(9, 12))
    dsshow(this_pts[['x', 'y']], ds.Point('x', 'y'), norm='eq_hist', cmap='inferno', ax=ax)
    phila_block_groups_3857.plot(facecolor="none", edgecolor="white", linewidth=0.2, alpha=0.5, ax=ax)
    # Label the frame with the hour, e.g. "08:00"
    ax.text(min_x + x_span * 0.07, max_y - y_span * 0.07, f'{this_hour:02d}:00', color="#ffffff", fontsize=16)
    ax.set_xlim(min_x, max_x)
    ax.set_ylim(min_y, max_y)
    ax.grid(False)
    ax.axis("off")
    # Save with an explicit .png extension so the GIF step can find the file
    fig.savefig(f'../data/destination-by-hour/hour{this_hour}.png', bbox_inches='tight')
    plt.close(fig)
    print(f'{this_hour:02d}:00 completed')

# Make a GIF of the different hours
gif_images = []
for this_hour in range(24):
    gif_images.append(imageio.imread(f'../data/destination-by-hour/hour{this_hour}.png'))
kwargs = {'duration': 0.5}
imageio.mimsave('../data/destination-by-hour/destination-by-hour.gif', gif_images, **kwargs)