Tutorial 1 Geospatial data types, operations and processing
In this tutorial, we introduce the fundamental geospatial data types and processing techniques using GeoPandas, a Python library for working with geospatial data. GeoPandas extends Pandas to handle spatial data, so that geospatial operations and analyses can be performed with familiar DataFrame tools.
All the data can be downloaded at this link (click)
# Import required libraries
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
A GeoPandas dataframe (GeoDataFrame) contains three parts:
index: The index column is a unique identifier for each row in the GeoDataFrame. It can be a simple integer index or a more complex index based on the data.
data: The data columns contain the attribute information associated with each geometric feature. These columns can include various data types, such as integers, floats, strings, and dates.
geometry: The geometry column contains the geometric representation of the features (e.g., points, lines, polygons). It is a special column that stores geometric objects from the Shapely library.
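The three parts can be inspected directly on a small GeoDataFrame. The sketch below is only illustrative: the column names, index labels, and coordinates are made up and do not come from the tutorial data.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical attribute columns and index labels, used only for illustration
gdf_parts = gpd.GeoDataFrame(
    {"city": ["A", "B"], "population": [100, 250]},
    geometry=[Point(0, 0), Point(1, 1)],
    index=["id_a", "id_b"],
)

print(list(gdf_parts.index))                  # the index
print(list(gdf_parts.columns))                # data columns plus the geometry column
print(gdf_parts.geometry.name)                # name of the active geometry column
print(type(gdf_parts.geometry).__name__)      # the geometry column is a GeoSeries
```

Dropping the geometry column with `gdf_parts.drop(columns="geometry")` returns the attribute data on its own.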
Geospatial data types:
Vector data: Represents geographic features using points, lines, and polygons (geometry types). Each feature has associated attributes stored in a table. Common file formats include Shapefile, GeoJSON, and KML.
| Feature | Shapefile | GeoJSON | KML |
|---|---|---|---|
| Format Type | Binary (requires multiple files) | Text (JSON-based, lightweight) | Text (XML-based, heavier) |
| Main Usage | Professional GIS software | Web mapping and APIs (e.g., Leaflet, Mapbox) | Google Earth, simple visualization |
| File Size & Efficiency | Efficient but needs .shp, .shx, .dbf files together | Light, easy to transfer over web | Larger files, less ideal for large datasets |
Big data formats:
| Feature | Parquet / GeoParquet | GeoPackage | GeoJSONSeq / JSONL |
|---|---|---|---|
| Format Type | Binary (columnar, optimized for analytics) | Binary (SQLite-based single-file database) | Text-based (line-delimited GeoJSON) |
| Main Usage | Big data processing, cloud analytics, fast I/O | Desktop GIS, mobile apps, portable multi-layer data | Streaming spatial data, logging, web pipelines |
| File Size & Efficiency | Very compact, highly efficient for large/tabular data | Compact, self-contained but larger than Parquet | Lightweight per line, but not efficient at scale |
| Multi-layer Support | 1 table per file, but easily batched | Fully supports multiple vector and raster layers | No layer structure |
Raster data: Represents geographic information as a grid of pixels, where each pixel has a value representing a specific attribute (e.g., elevation, temperature). Common formats include GeoTIFF and NetCDF.
1.1 Fundamental geometric objects
Geometry types / geometric objects:
Points: Represent discrete locations (e.g., cities, landmarks).
MultiPoint: A collection of multiple points, often used to represent clusters of discrete locations (e.g., a group of cities).
LineString: Represent linear features (e.g., roads, rivers).
MultiLineString: A collection of multiple lines, often used to represent complex linear features (e.g., a river with multiple branches).
Polygons: Represent areas (e.g., countries, lakes).
MultiPolygon: A collection of multiple polygons, often used to represent complex areas (e.g., a country with multiple islands).
We use a Python library called Shapely to handle and process the geometric objects. We don't need to install it separately, as it is installed together with GeoPandas as one of its dependencies.
# Import the required libraries
from shapely.geometry import Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon
1.1.1 Point
# Create a point object
point1 = Point(1, 2)
point2 = Point(3, 4)
# The point object
point1
# Print the point objects type in python
type(point1)
shapely.geometry.point.Point
# Or we can use the geom_type to check the type of the point object
point1.geom_type
'Point'
Point attributes
# We can also check the coordinate info (x, y) of the point object
print(list(point1.coords), point1.x, point1.y)
[(1.0, 2.0)] 1.0 2.0
Distance between two points
Shapely calculates the distance between two points with the Euclidean distance formula, i.e., the straight-line distance in a Cartesian coordinate system (we will introduce the coordinate reference system (CRS) later). The result is expressed in the units of the coordinates, so it is important to check the measurement unit (metre, foot, or degree) of the CRS you are using.
# Calculate the distance between two points
point1.distance(point2)
2.8284271247461903
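To see why the unit matters, the sketch below compares a planar distance computed on degrees with a great-circle distance in kilometres. The `haversine_km` helper, the spherical Earth radius of 6371 km, and the sample coordinates are assumptions made for illustration; they are not part of Shapely.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km between two lon/lat points (spherical Earth assumed)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two hypothetical pairs of points, each one degree of longitude apart
planar_deg = math.hypot(1.0 - 0.0, 0.0)             # planar distance on raw degrees
at_equator = haversine_km(0.0, 0.0, 1.0, 0.0)       # same longitude gap at latitude 0
at_london_lat = haversine_km(0.0, 51.5, 1.0, 51.5)  # same longitude gap at latitude 51.5

print(planar_deg, round(at_equator, 1), round(at_london_lat, 1))
```

The planar result is the same in both cases, while the real distance on the ground differs considerably, which is why distances should be measured in a projected CRS.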
Creating a GeoDataFrame from a DataFrame
# Create a DataFrame with point1 and point2
df = pd.DataFrame({'name': ['pt1', 'pt2'], 'geometry': [point1, point2]})
# Create a GeoDataFrame from the DataFrame
gdf1 = gpd.GeoDataFrame(df, geometry='geometry')
gdf1
| | name | geometry |
|---|---|---|
| 0 | pt1 | POINT (1 2) |
| 1 | pt2 | POINT (3 4) |
If there is a large set of coords in a file, we can use gpd.points_from_xy() to create a GeoDataFrame with points from the x and y coordinates.
# Create a DataFrame with x and y coordinates
df = pd.DataFrame({'x': [1, 2, 3, 6], 'y': [4, 5, 6, 8]})
# Create a GeoDataFrame from the DataFrame
gdf2 = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.x, df.y))
gdf2
| | x | y | geometry |
|---|---|---|---|
| 0 | 1 | 4 | POINT (1 4) |
| 1 | 2 | 5 | POINT (2 5) |
| 2 | 3 | 6 | POINT (3 6) |
| 3 | 6 | 8 | POINT (6 8) |
1.1.2 LineString
# Create a LineString object
line1 = LineString([(0, 0), (1, 1)])  # A line made of two points
line2 = LineString([(0, 0), (1, 2), (2, 2)])  # A line made of three points
line3 = LineString([(0, 0), (0, 2), (2, 3), (3, 1)])  # A line made of four points
# The LineString object
line1
# The LineString object
line2
# The LineString object
line3
# Print the LineString objects type in python
type(line1)
shapely.geometry.linestring.LineString
# Or we can use the geom_type to check the type of the LineString object
line1.geom_type
'LineString'
LineString attributes
# We can also check the coordinate info (x, y) of points within the LineString object
print('line1', list(line1.coords), line1.xy)
print('line2', list(line2.coords), line2.xy)
print('line3', list(line3.coords), line3.xy)
line1 [(0.0, 0.0), (1.0, 1.0)] (array('d', [0.0, 1.0]), array('d', [0.0, 1.0]))
line2 [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0)] (array('d', [0.0, 1.0, 2.0]), array('d', [0.0, 2.0, 2.0]))
line3 [(0.0, 0.0), (0.0, 2.0), (2.0, 3.0), (3.0, 1.0)] (array('d', [0.0, 0.0, 2.0, 3.0]), array('d', [0.0, 2.0, 3.0, 1.0]))
Calculate the length of the LineString object
# Calculate the length of the LineString object
line1.length, line2.length, line3.length
(1.4142135623730951, 3.23606797749979, 6.47213595499958)
The centroid of a LineString
The centroid of a LineString is the point that represents the geometric center of the line. It is calculated as the length-weighted average of the midpoints of the line's segments, not as the plain average of its vertices, so longer segments pull the centroid towards them. The centroid is not necessarily a point on the line itself.
# Calculate the centroid of the LineString object
line1.centroid, line2.centroid, line3.centroid
(<POINT (0.5 0.5)>, <POINT (0.809 1.309)>, <POINT (1.209 1.864)>)
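The centroid reported for line2 can be reproduced by hand as a length-weighted average of segment midpoints. The sketch below is a plain-Python illustration under that assumption; `linestring_centroid` is a hypothetical helper, not a Shapely function.

```python
import math

def linestring_centroid(coords):
    """Length-weighted mean of segment midpoints for a list of (x, y) vertices."""
    total = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        cx += seg_len * (x1 + x2) / 2
        cy += seg_len * (y1 + y2) / 2
        total += seg_len
    return cx / total, cy / total

line2_coords = [(0, 0), (1, 2), (2, 2)]  # same vertices as line2

weighted = linestring_centroid(line2_coords)
vertex_mean = (
    sum(x for x, _ in line2_coords) / len(line2_coords),
    sum(y for _, y in line2_coords) / len(line2_coords),
)

print(weighted)      # matches the Shapely centroid of line2 after rounding
print(vertex_mean)   # the plain vertex average is a different point
```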
# we use matplotlib to plot the LineStrings and their centroids
import matplotlib.pyplot as plt
# Create a figure and axis
fig, ax = plt.subplots(figsize=(6, 6))
# Plot the LineString object
x, y = line1.xy
ax.plot(x, y, color='blue', linewidth=2, label='LineString 1')
x, y = line2.xy
ax.plot(x, y, color='red', linewidth=2, label='LineString 2')
x, y = line3.xy
ax.plot(x, y, color='green', linewidth=2, label='LineString 3')
# Plot the centroid of the LineString object
ax.plot(line1.centroid.x, line1.centroid.y, 'o', color='blue', markersize=10, label='Centroid 1')
ax.plot(line2.centroid.x, line2.centroid.y, 'o', color='red', markersize=10, label='Centroid 2')
ax.plot(line3.centroid.x, line3.centroid.y, 'o', color='green', markersize=10, label='Centroid 3')
ax.grid()
# Add a title and labels
ax.set_title('LineString and Centroid')
ax.set_xlabel('X')
ax.set_ylabel('Y')
# Add a legend
ax.legend()
# Show the plot
plt.show()
1.1.3 Polygon
Building a polygon takes a little more care than a point or a line. A polygon is defined by a ring of at least three distinct points, and the first and last coordinates of the ring must coincide to close it (Shapely closes the ring automatically if the last point is omitted). The points are connected in the order they are defined, forming the edges of the polygon.
# Create a Polygon object
polygon1 = Polygon([(0, 0), (1, 1), (1, 0), (0, 0)]) # A polygon is made of three points
polygon2 = Polygon([(1, 1), (2, 4), (3, 4), (4, 2), (1, 1)]) # A polygon is made of four points
polygon3 = Polygon([(1, 2), (2, 5), (3, 4),(5, 5), (3, 2), (1, 2)]) # A polygon is made of five points
# The Polygon object
polygon1
# The Polygon object
polygon2
# The Polygon object
polygon3
# We can also create a polygon from a LineString object; the ring built from its four points is closed automatically.
polygon4 = Polygon(line3)
# The Polygon object
polygon4
# We can also create a polygon with a hole using the shell and holes arguments
# The outer boundary of the polygon (the shell)
outer_boundary = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
# The inner boundary of the polygon (the hole)
inner_boundary = [(1, 1), (1, 3), (3, 3), (3, 1), (1, 1)]
polygon5 = Polygon(shell=outer_boundary, holes=[inner_boundary])
# The Polygon object
polygon5
# Print the Polygon objects type in python
type(polygon1)
shapely.geometry.polygon.Polygon
# Or we can use the geom_type to check the type of the Polygon object
polygon1.geom_type
'Polygon'
Polygon attributes
# We can also check the coordinate info (x, y) of points within the Polygon object
print('polygon2', list(polygon2.exterior.coords), polygon2.exterior.xy)
polygon2 [(1.0, 1.0), (2.0, 4.0), (3.0, 4.0), (4.0, 2.0), (1.0, 1.0)] (array('d', [1.0, 2.0, 3.0, 4.0, 1.0]), array('d', [1.0, 4.0, 4.0, 2.0, 1.0]))
# get the exterior and interior coordinates of the polygon5
print('polygon5', list(polygon5.exterior.coords), polygon5.exterior.xy)
print('polygon5', list(polygon5.interiors[0].coords), polygon5.interiors[0].xy)
polygon5 [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (0.0, 0.0)] (array('d', [0.0, 4.0, 4.0, 0.0, 0.0]), array('d', [0.0, 0.0, 4.0, 4.0, 0.0]))
polygon5 [(1.0, 1.0), (1.0, 3.0), (3.0, 3.0), (3.0, 1.0), (1.0, 1.0)] (array('d', [1.0, 1.0, 3.0, 3.0, 1.0]), array('d', [1.0, 3.0, 3.0, 1.0, 1.0]))
# The exterior length of the polygon
print('polygon2', polygon2.exterior.length)
polygon2 9.56062329783655
# The exterior and interior length of the polygon5
print('polygon5', polygon5.exterior.length, polygon5.interiors[0].length)
polygon5 16.0 8.0
# Calculate the area of the Polygon object
polygon1.area, polygon2.area, polygon3.area, polygon4.area, polygon5.area
(0.5, 5.0, 6.0, 5.5, 12.0)
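These areas can be checked by hand with the shoelace formula; for a polygon with a hole, the hole's area is subtracted from the shell's area. The sketch below is a plain-Python illustration, and `ring_area` is a hypothetical helper rather than part of Shapely.

```python
def ring_area(coords):
    """Unsigned area of a ring of (x, y) vertices via the shoelace formula."""
    twice_area = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

# Same coordinates as polygon2 and polygon5
area2 = ring_area([(1, 1), (2, 4), (3, 4), (4, 2), (1, 1)])
shell_area = ring_area([(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)])
hole_area = ring_area([(1, 1), (1, 3), (3, 3), (3, 1), (1, 1)])
area5 = shell_area - hole_area

print(area2, area5)
```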
# we use matplotlib to plot the Polygons and their centroids
# Create a figure and axis
fig, ax = plt.subplots(figsize=(6, 6))
# Plot the Polygon object
x, y = polygon1.exterior.xy
ax.plot(x, y, color='blue', linewidth=2, label='Polygon 1')
x, y = polygon2.exterior.xy
ax.plot(x, y, color='red', linewidth=2, label='Polygon 2')
x, y = polygon3.exterior.xy
ax.plot(x, y, color='green', linewidth=2, label='Polygon 3')
# Plot the centroid of the Polygon object
ax.plot(polygon1.centroid.x, polygon1.centroid.y, 'o', color='blue', markersize=10, label='Centroid 1')
ax.plot(polygon2.centroid.x, polygon2.centroid.y, 'o', color='red', markersize=10, label='Centroid 2')
ax.plot(polygon3.centroid.x, polygon3.centroid.y, 'o', color='green', markersize=10, label='Centroid 3')
ax.grid()
# Add a title and labels
ax.set_title('Polygon and Centroid')
ax.set_xlabel('X')
ax.set_ylabel('Y')
# Add a legend
ax.legend()
# Show the plot
plt.show()
1.1.4 MultiPoint, MultiLineString, and MultiPolygon
# Create a MultiPoint object
multipoint = MultiPoint([point1, point2])
# The MultiPoint object
multipoint
# Print the MultiPoint objects type in python
type(multipoint)
shapely.geometry.multipoint.MultiPoint
# Or we can use the geom_type to check the type of the MultiPoint object
multipoint.geom_type
'MultiPoint'
# Create a MultiLineString object
multiline = MultiLineString([line1, line2, line3])
# The MultiLineString object
multiline
# Print the MultiLineString objects type in python
type(multiline)
shapely.geometry.multilinestring.MultiLineString
# Or we can use the geom_type to check the type of the MultiLineString object
multiline.geom_type
'MultiLineString'
# Create a MultiPolygon object
multipolygon = MultiPolygon([polygon1, polygon2, polygon3])
# The MultiPolygon object
multipolygon
# Print the MultiPolygon objects type in python
type(multipolygon)
shapely.geometry.multipolygon.MultiPolygon
# Or we can use the geom_type to check the type of the MultiPolygon object
multipolygon.geom_type
'MultiPolygon'
Note: we can also get the length, area, centroid, and distance of the MultiPoint, MultiLineString, and MultiPolygon objects.
When you need these measures for a large number of points, lines, and polygons, see geopandas.GeoSeries.distance and the other geopandas.GeoSeries operations, which apply them to a whole column at once.
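A minimal sketch of such column-wise operations is shown below. The three geometries are toy examples in an arbitrary planar coordinate system (no CRS is set), and the column names of the summary table are chosen only for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, Point, Polygon

# Toy geometries in an arbitrary planar coordinate system
geoms = gpd.GeoSeries([
    Point(3, 4),
    LineString([(0, 0), (3, 4)]),
    Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
])
origin = Point(0, 0)

# Each attribute or method is evaluated element-wise and returns a Series
summary = pd.DataFrame({
    "geom_type": geoms.geom_type,
    "length": geoms.length,                   # perimeter for polygons
    "area": geoms.area,                       # zero for points and lines
    "dist_to_origin": geoms.distance(origin),
})
print(summary)
```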
1.2 Map Projection
Coordinate reference system
A Coordinate Reference System (CRS) defines how coordinates relate to real locations on the Earth. It specifies a model of the Earth's shape (through a datum) and, for a projected CRS, the method used to project that curved surface onto a two-dimensional plane. Without a CRS, spatial data cannot be accurately positioned, nor can it be reliably combined with other datasets.
| CRS Name | EPSG Code | Type | Notes |
|---|---|---|---|
| OSGB36 / British National Grid | EPSG:27700 | Projected | The main CRS for mapping in Great Britain. Uses a Transverse Mercator projection and the OSGB36 datum. |
| WGS 84 | EPSG:4326 | Geographic | Used globally (e.g. GPS systems, Google Maps). Coordinates in latitude and longitude. |
| Irish Grid (Ireland and Northern Ireland) | EPSG:29902 | Projected | Separate grid for Ireland (but related principles). |
| Web Mercator | EPSG:3857 | Projected | Used by many web mapping applications (e.g. Google Maps, OpenStreetMap). Distorts areas and distances, but preserves angles. |
| NAD83 / UTM Zone 10N | EPSG:26910 | Projected | Used in North America. Based on the UTM system, which divides the world into a series of zones. |
EPSG (European Petroleum Survey Group) codes are a standardized set of identifiers for coordinate reference systems. They provide a unique identifier for each CRS, making it easier to reference and use them in GIS applications.
You can check all reference systems in the EPSG registry here (click).
Geographic vs. Projected Coordinate Systems
| Type | Geographic Coordinate System (GCS) | Projected Coordinate System (PCS) |
|---|---|---|
| Coordinates | Latitude (Y) and Longitude (X), in degrees | X and Y values in meters, feet, or other linear units |
| Surface | Curved (earth-like, 3D ellipsoid) | Flat (2D map surface) |
| Examples | WGS84 (EPSG:4326), NAD83 | UTM, State Plane, British National Grid |
| Good for | Global data, navigation, GPS | Local maps, accurate distances, areas, engineering |
| Issues | Hard to measure real distances (degrees aren't equal in size everywhere) | Distortions (shape, area, distance, direction); you have to choose which to minimize |
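The distinction can also be checked programmatically. The sketch below uses pyproj (installed as a GeoPandas dependency) to look up three of the EPSG codes mentioned above and report whether each is geographic or projected, together with the unit of its first axis. The dictionary name `crs_info` is just an illustrative choice.

```python
from pyproj import CRS

crs_info = {}
for code in (4326, 27700, 3857):
    crs = CRS.from_epsg(code)
    kind = "geographic" if crs.is_geographic else "projected"
    unit = crs.axis_info[0].unit_name  # unit of the first coordinate axis
    crs_info[code] = (crs.name, kind, unit)
    print(f"EPSG:{code}  {crs.name}  {kind}  unit={unit}")
```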
Now, we use the UK Countries Boundaries data downloaded from ONS Open Geography portal as an example with GeoPandas.
# Load the shapefile, please note the .shp file is a part of the shapefile, and you need to download all the files in the same folder.
gdf_uk_1 = gpd.read_file("data/Countries_December_2024_Boundaries_UK/CTRY_DEC_2024_UK_BFC.shp")
# We get the geo-dataframe gdf_uk_1, which contains the geometry column (which is a multipolygon) and other attribute columns (four rows refer to four countries).
gdf_uk_1
| | CTRY24CD | CTRY24NM | CTRY24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|
| 0 | E92000001 | England | Lloegr | 394883 | 370883 | -2.07812 | 53.2350 | 5cad1ec2-bbe1-4ec4-bcd9-ba0cb9c3fc1f | MULTIPOLYGON (((83962.84 5401.15, 83970.68 540... |
| 1 | N92000002 | Northern Ireland | Gogledd Iwerddon | 86544 | 535337 | -6.85571 | 54.6150 | 8d8effb1-0159-4cd6-b856-21a8754b4693 | MULTIPOLYGON (((131198.094 468427.673, 131196.... |
| 2 | S92000003 | Scotland | Yr Alban | 277744 | 700060 | -3.97094 | 56.1774 | a158e058-71b1-4272-b4bf-91c241d13159 | MULTIPOLYGON (((265944.63 543512.72, 265945.83... |
| 3 | W92000004 | Wales | Cymru | 263405 | 242881 | -3.99418 | 52.0674 | c78b0dcc-7d89-42b2-9667-57aa91a55e74 | MULTIPOLYGON (((322081.699 165165.901, 322082.... |
# A GeoSeries is the type of the geometry column in a GeoDataFrame; it stores geometries as Shapely objects.
type(gdf_uk_1.geometry)
geopandas.geoseries.GeoSeries
# Each element of the geometry column is a Shapely geometry (here a MultiPolygon).
type(gdf_uk_1.geometry[0])
shapely.geometry.multipolygon.MultiPolygon
# Load the geojson file; you may observe that the geojson file is much smaller than the shapefile.
gdf_uk_2 = gpd.read_file("data/Countries_December_2024_Boundaries_UK.geojson")
# We get the GeoDataFrame gdf_uk_2, which contains the same attribute information as gdf_uk_1 from the shapefile.
# The coordinates in the geometry column look different because this file uses a different CRS (degrees instead of metres), but they represent the same countries.
gdf_uk_2
| | FID | CTRY24CD | CTRY24NM | CTRY24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | E92000001 | England | Lloegr | 394883 | 370883 | -2.07812 | 53.23497 | bd411920-e7ea-4f71-b6c8-5f1d24ec92d3 | MULTIPOLYGON (((-6.34905 49.89822, -6.32842 49... |
| 1 | 2 | N92000002 | Northern Ireland | Gogledd Iwerddon | 86544 | 535337 | -6.85571 | 54.61502 | 652c0c4b-647b-4565-b9ed-e9c17ec5834c | MULTIPOLYGON (((-5.52389 54.67041, -5.52451 54... |
| 2 | 3 | S92000003 | Scotland | Yr Alban | 277744 | 700060 | -3.97094 | 56.17744 | 97bb1057-3e8d-4ad8-83ef-4577d1bb4d9c | MULTIPOLYGON (((-3.06033 54.98452, -3.06337 54... |
| 3 | 4 | W92000004 | Wales | Cymru | 263405 | 242881 | -3.99418 | 52.06742 | f7c86b8c-b705-44b7-bb7b-46323f7bddfe | MULTIPOLYGON (((-4.30971 51.56253, -4.31141 51... |
# Check the CRS of the gdf_uk_1
gdf_uk_1.crs
<Projected CRS: EPSG:27700>
Name: OSGB36 / British National Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: United Kingdom (UK) - offshore to boundary of UKCS within 49°45'N to 61°N and 9°W to 2°E; onshore Great Britain (England, Wales and Scotland). Isle of Man onshore.
- bounds: (-9.01, 49.75, 2.01, 61.01)
Coordinate Operation:
- name: British National Grid
- method: Transverse Mercator
Datum: Ordnance Survey of Great Britain 1936
- Ellipsoid: Airy 1830
- Prime Meridian: Greenwich
# we can also plot the gdf_uk_1 check the projection: x and y are in metre.
gdf_uk_1.plot(color="skyblue", edgecolor="black", lw= 0.5, alpha=0.8, figsize=(10, 10))
<Axes: >
# Check the CRS of the gdf_uk_2
gdf_uk_2.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
# we can also plot the gdf_uk_2 check the projection: x and y are in degree.
gdf_uk_2.plot(color="skyblue", edgecolor="black", lw= 0.5, alpha=0.8, figsize=(10, 10))
<Axes: >
What is a map projection?
A projection is a mathematical transformation that converts geographic coordinates on the curved surface of the Earth (latitude and longitude) into 2D Cartesian coordinates (X, Y).
# We can use GeoPandas to reproject gdf_uk_2, i.e., change its CRS to match that of gdf_uk_1.
gdf_uk_2 = gdf_uk_2.to_crs(epsg=27700)
# The coordinate information in the geometry column of gdf_uk_2 has been changed from degree to meter, i.e., the same as the gdf_uk_1.
gdf_uk_2
| | FID | CTRY24CD | CTRY24NM | CTRY24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | E92000001 | England | Lloegr | 394883 | 370883 | -2.07812 | 53.23497 | bd411920-e7ea-4f71-b6c8-5f1d24ec92d3 | MULTIPOLYGON (((87796.624 8850.924, 89240.332 ... |
| 1 | 2 | N92000002 | Northern Ireland | Gogledd Iwerddon | 86544 | 535337 | -6.85571 | 54.61502 | 652c0c4b-647b-4565-b9ed-e9c17ec5834c | MULTIPOLYGON (((172876.341 536297.016, 172828.... |
| 2 | 3 | S92000003 | Scotland | Yr Alban | 277744 | 700060 | -3.97094 | 56.17744 | 97bb1057-3e8d-4ad8-83ef-4577d1bb4d9c | MULTIPOLYGON (((332243.348 566061.14, 332047.3... |
| 3 | 4 | W92000004 | Wales | Cymru | 263405 | 242881 | -3.99418 | 52.06742 | f7c86b8c-b705-44b7-bb7b-46323f7bddfe | MULTIPOLYGON (((239998.199 187379.085, 239876.... |
# Plot gdf_uk_2 again to check the projection: x and y are now in metres.
gdf_uk_2.plot(color="skyblue", edgecolor="black", lw=0.5, alpha=0.8, figsize=(10, 10))
# Start both axes at 0, the origin of the British National Grid
plt.xlim(left=0)
plt.ylim(bottom=0)
# let y ticks show all the numbers
plt.ticklabel_format(style='plain', axis='y')
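The same reprojection can be tried on a single point to see the numbers change. In the sketch below the coordinates are a made-up location near central London, and the variable names are illustrative. Note that GeoPandas and Shapely always store coordinates as (x, y), i.e., (longitude, latitude) for WGS 84, even though the CRS description lists latitude first.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical point near central London, given as (longitude, latitude)
pt_wgs84 = gpd.GeoSeries([Point(-0.1276, 51.5072)], crs="EPSG:4326")

# Reproject to British National Grid (metres), then back to WGS 84 (degrees)
pt_bng = pt_wgs84.to_crs(epsg=27700)
pt_back = pt_bng.to_crs(epsg=4326)

print(pt_bng.iloc[0])    # easting and northing in metres
print(pt_back.iloc[0])   # round trip back to degrees
```

The exact easting and northing can vary by a few metres depending on which datum transformation the local PROJ installation selects.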
1.3 Geometric operations – Overlay
In this section, we will learn how to perform overlay operations on geospatial data. Overlay operations are used to combine two or more layers of geospatial data to create a new layer that contains information from all the input layers. The GeoPandas overlay function works on two layers at a time and supports four main types of overlay operation: intersection, union, difference, and symmetric difference (it also offers an identity operation).
(Spatial overlay with two input vector layers (a_input = rectangle, b_input = circle). The resulting vector layer is displayed in green. QGIS documentation)
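Before turning to real data, the four operations can be tried on two overlapping squares. The sketch below uses toy geometries in an arbitrary planar coordinate system; the layer names `layer_a` and `layer_b` and their attribute columns are made up for the example.

```python
import geopandas as gpd
from shapely.geometry import box

# Two overlapping 2 x 2 squares; the overlap is a 1 x 1 square
layer_a = gpd.GeoDataFrame({"a_id": [1]}, geometry=[box(0, 0, 2, 2)])
layer_b = gpd.GeoDataFrame({"b_id": [1]}, geometry=[box(1, 1, 3, 3)])

overlay_areas = {}
for how in ("intersection", "union", "difference", "symmetric_difference"):
    result = gpd.overlay(layer_a, layer_b, how=how)
    overlay_areas[how] = float(result.area.sum())
    print(how, len(result), overlay_areas[how])
```

The total areas follow directly from the geometry: the overlap for intersection, both squares minus the overlap for union, the first square minus the overlap for difference, and everything except the overlap for symmetric difference.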
Here, we use the London local authorities, selected from the UK local authority boundaries used in the previous week, and the 2019 boundary of the London Ultra Low Emission Zone (ULEZ), which covers the same area as the central London Congestion Charge Zone (CCZ). The CCZ is a designated area in central London where drivers are required to pay a fee to drive within the zone during certain hours. The ULEZ is an area in London where only vehicles that meet strict emissions standards can enter without paying a charge. (Please note that the ULEZ was expanded in 2021 and 2023, but we use the 2019 boundary, i.e., the CCZ boundary, in this case.)
# Read the UK Local Authority boundaries in geojson
gdf_uk_la = gpd.read_file("data/Local_Authority_Districts_December_2024_Boundaries_UK_BSC.geojson")
# Select the London local authority boundaries using the codes in the LAD24CD column.
# As all London LA codes start with 'E09', we can use a string method to filter the data.
gdf_uk_london = gdf_uk_la[gdf_uk_la['LAD24CD'].str.startswith('E09')]
# We can observe that London has 33 local authorities (33 rows in GeoDataFrame).
gdf_uk_london
| | FID | LAD24CD | LAD24NM | LAD24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
| 263 | 264 | E09000001 | City of London | | 532382 | 181358 | -0.093520 | 51.51564 | 741710fd-03e1-4b41-8645-7ebcfc5961ac | POLYGON ((-0.07853 51.52151, -0.07687 51.51663... |
| 264 | 265 | E09000002 | Barking and Dagenham | | 547757 | 185111 | 0.129479 | 51.54556 | a2f59957-115c-478b-8c27-5162ee915dc7 | POLYGON ((0.15436 51.56611, 0.16189 51.56162, ... |
| 265 | 266 | E09000003 | Barnet | | 523473 | 191752 | -0.218200 | 51.61107 | 9c7bda3b-2831-4799-857a-c83a49d16e4e | POLYGON ((-0.19987 51.67017, -0.19107 51.6639,... |
| 266 | 267 | E09000004 | Bexley | | 549202 | 175434 | 0.146212 | 51.45823 | ce2684df-0b6a-45fb-a5f7-85b9ab35e0de | POLYGON ((0.18654 51.48046, 0.20084 51.47866, ... |
| 267 | 268 | E09000005 | Brent | | 519615 | 186465 | -0.275690 | 51.56439 | 3b5f8a90-cb62-4570-984f-fbf322e910bc | POLYGON ((-0.2495 51.58557, -0.25173 51.58338,... |
| 268 | 269 | E09000006 | Bromley | | 542036 | 165707 | 0.039246 | 51.37266 | 13f562be-b8cb-427c-bf86-0f845358e1ac | POLYGON ((0.03975 51.44098, 0.05821 51.42487, ... |
| 269 | 270 | E09000007 | Camden | | 527491 | 184283 | -0.162910 | 51.54305 | 88f3f650-53ac-434d-883e-6235b48d14c4 | POLYGON ((-0.13842 51.55687, -0.13072 51.55067... |
| 270 | 271 | E09000008 | Croydon | | 533922 | 164745 | -0.077620 | 51.36599 | c8fce222-bb0d-4a51-a695-68c991b41d31 | POLYGON ((-0.11263 51.42324, -0.10596 51.42259... |
| 271 | 272 | E09000009 | Ealing | | 517055 | 181959 | -0.314100 | 51.52443 | b2200ae0-425c-441f-88f8-b3ad3afc2152 | POLYGON ((-0.33556 51.55656, -0.31253 51.54903... |
| 272 | 273 | E09000010 | Enfield | | 532831 | 196198 | -0.081440 | 51.64890 | 53c0b28d-45c0-462e-96ff-dd23c7f8cead | POLYGON ((-0.08389 51.68991, -0.06209 51.68298... |
| 273 | 274 | E09000011 | Greenwich | | 542507 | 175878 | 0.050093 | 51.46394 | c559fa29-c300-4d10-8c1f-c0ea59815fa1 | MULTIPOLYGON (((-0.01733 51.4802, -0.01876 51.... |
| 274 | 275 | E09000012 | Hackney | | 534560 | 185787 | -0.060460 | 51.55493 | a41cf0fc-95ed-4823-a9db-7b8bc60f421b | POLYGON ((-0.01717 51.55158, -0.01655 51.54333... |
| 275 | 276 | E09000013 | Hammersmith and Fulham | | 523867 | 177993 | -0.217350 | 51.48733 | fb263bb7-9972-416c-9e8c-fc6966c08221 | POLYGON ((-0.21503 51.50219, -0.20795 51.49603... |
| 276 | 277 | E09000014 | Haringey | | 531260 | 189349 | -0.106700 | 51.58772 | a4407a53-4227-42ea-be81-5933bf73fc98 | POLYGON ((-0.11562 51.60842, -0.11445 51.6084,... |
| 277 | 278 | E09000015 | Harrow | | 515359 | 189736 | -0.335990 | 51.59467 | 72e2ba10-bf08-4504-8775-6fd08d725dd7 | POLYGON ((-0.2842 51.5905, -0.28246 51.58505, ... |
| 278 | 279 | E09000016 | Havering | | 555032 | 187514 | 0.235368 | 51.56520 | ef6d5be3-a2f9-4ad8-8c35-5cdc37e0bf9e | POLYGON ((0.22409 51.63174, 0.26326 51.60919, ... |
| 279 | 280 | E09000017 | Hillingdon | | 508168 | 183121 | -0.441790 | 51.53664 | ac5e6b28-6cc5-413b-b5b0-2d86048fe3f1 | POLYGON ((-0.45974 51.61316, -0.45713 51.61229... |
| 280 | 281 | E09000018 | Hounslow | | 512737 | 174959 | -0.378550 | 51.46239 | a53588f2-bf1c-41b5-83f7-d0e7419b35df | POLYGON ((-0.27418 51.49729, -0.26906 51.49403... |
| 281 | 282 | E09000019 | Islington | | 531160 | 184645 | -0.109900 | 51.54547 | 43a872fc-e401-45af-a2e6-3b2befc98837 | POLYGON ((-0.07768 51.54948, -0.07669 51.54609... |
| 282 | 283 | E09000020 | Kensington and Chelsea | | 525756 | 179054 | -0.189780 | 51.49645 | 29dc7a0f-a487-4156-a842-975bea2a337d | POLYGON ((-0.19991 51.51684, -0.19917 51.51454... |
| 283 | 284 | E09000021 | Kingston upon Thames | | 519508 | 167389 | -0.283680 | 51.39296 | 1e70f43d-4ddf-47dc-88bb-4df2efbfcafc | POLYGON ((-0.25424 51.4293, -0.24952 51.41478,... |
| 284 | 285 | E09000022 | Lambeth | | 531118 | 175629 | -0.113850 | 51.46445 | 9143bbc4-6bda-4e17-8148-cae2813f694d | POLYGON ((-0.09936 51.47264, -0.09598 51.46987... |
| 285 | 286 | E09000023 | Lewisham | | 537888 | 173343 | -0.017340 | 51.44230 | 99f48b46-bb26-47d7-8c90-bab1eeba4425 | POLYGON ((-0.01876 51.47891, -0.02274 51.47535... |
| 286 | 287 | E09000024 | Merton | | 526068 | 169508 | -0.188690 | 51.41059 | 2e85b11a-12d0-45c2-862c-ffd0dce83b1a | POLYGON ((-0.18985 51.44027, -0.18977 51.43135... |
| 287 | 288 | E09000025 | Newham | | 540713 | 183346 | 0.027261 | 51.53150 | 70e80979-1178-4228-9ae5-2e8d09e17e69 | POLYGON ((0.05034 51.56402, 0.06015 51.55641, ... |
| 288 | 289 | E09000026 | Redbridge | | 543512 | 189477 | 0.070085 | 51.58588 | 947652cf-8d04-498f-aeac-8cb34f1052d8 | POLYGON ((0.02182 51.62883, 0.04079 51.61573, ... |
| 289 | 290 | E09000027 | Richmond upon Thames | | 519005 | 172650 | -0.289140 | 51.44035 | 533a2a77-3779-4660-b9de-90d93540ee21 | POLYGON ((-0.23296 51.47168, -0.23357 51.46535... |
| 290 | 291 | E09000028 | Southwark | | 533945 | 175869 | -0.073090 | 51.46595 | 7d09d625-212a-4819-9b12-36c8d8d2b697 | POLYGON ((-0.07083 51.50252, -0.07351 51.50046... |
| 291 | 292 | E09000029 | Sutton | | 527357 | 163639 | -0.172270 | 51.35755 | 1b5d57bc-1cb5-43e1-85ba-3bb45da0b7cc | POLYGON ((-0.16995 51.39173, -0.1653 51.388, -... |
| 292 | 293 | E09000030 | Tower Hamlets | | 536340 | 181452 | -0.036480 | 51.51555 | 54f82e61-dc15-42cb-b777-681b4326f855 | POLYGON ((-0.03319 51.54469, -0.02899 51.54227... |
| 293 | 294 | E09000031 | Waltham Forest | | 537328 | 190278 | -0.018800 | 51.59462 | 97cd624f-2c02-4f12-a78e-7aedc777673f | POLYGON ((0.02009 51.62643, 0.01454 51.61889, ... |
| 294 | 295 | E09000032 | Wandsworth | | 525152 | 174138 | -0.200220 | 51.45240 | 9c51587a-4315-449d-bcc7-e65625e402f7 | POLYGON ((-0.14969 51.46129, -0.14814 51.4564,... |
| 295 | 296 | E09000033 | Westminster | | 528268 | 180871 | -0.152950 | 51.51222 | a52ff0f8-4ac0-4417-8338-01be3cbb516a | POLYGON ((-0.17348 51.53765, -0.1649 51.53578,... |
# We check the coordinate reference system (CRS) of the London local authority boundaries, which is WGS 84.
gdf_uk_london.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
# We can plot the London local authority boundaries using the plot method.
gdf_uk_london.plot(figsize=(6, 6), edgecolor='grey', color='skyblue', alpha=0.7, linewidth=0.4)
<Axes: >
# We read the ULEZ 2019 / CCZ boundary in shp format.
gdf_ccz = gpd.read_file("data/ULEZ_2019_Central_Congestion_Charging_Zone/UltraLowEmissionsZoneBoundary(ULEZ).shp")
# We can observe that the CCZ boundary has 1 row (CCZ is a single polygon).
gdf_ccz
| | OBJECTID | BOUNDARY | Shape_Area | geometry |
|---|---|---|---|---|
| 0 | 1 | CSS Area | 21.375571 | POLYGON ((531562.664 183054.181, 531582.254 18... |
# Check the CRS of the CCZ boundary, which is OSGB 1936 / British National Grid.
gdf_ccz.crs
<Projected CRS: EPSG:27700>
Name: OSGB36 / British National Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: United Kingdom (UK) - offshore to boundary of UKCS within 49°45'N to 61°N and 9°W to 2°E; onshore Great Britain (England, Wales and Scotland). Isle of Man onshore.
- bounds: (-9.01, 49.75, 2.01, 61.01)
Coordinate Operation:
- name: British National Grid
- method: Transverse Mercator
Datum: Ordnance Survey of Great Britain 1936
- Ellipsoid: Airy 1830
- Prime Meridian: Greenwich
# Plot the CCZ boundary using the plot method.
gdf_ccz.plot(figsize=(6, 6), edgecolor='grey', color='red', alpha=0.3, linewidth=0.4)
<Axes: >
We need both layers in the same CRS to perform overlay operations, so we reproject the London boundaries from WGS 84 to OSGB 1936 / British National Grid. (We recommend working in the projected CRS because overlay, area, and distance calculations are planar, and a projected CRS gives results in metres rather than degrees.)
# Change London LA boundaries CRS from WGS 84 to OSGB 1936 / British National Grid.
gdf_uk_london = gdf_uk_london.to_crs(epsg=27700)
# We can check the CRS of the London local authority boundaries again.
gdf_uk_london.crs == gdf_ccz.crs
True
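The check and the reprojection can be wrapped in a small helper so that an overlay is never run on mismatched layers. The sketch below is one possible way to do this; `overlay_same_crs` is a hypothetical helper, and the two toy boxes stand in for the real boundary layers.

```python
import geopandas as gpd
from shapely.geometry import box

def overlay_same_crs(left, right, how="intersection"):
    """Reproject `right` to the CRS of `left` if needed, then overlay (hypothetical helper)."""
    if left.crs is None or right.crs is None:
        raise ValueError("Both layers need a CRS before they can be aligned.")
    if left.crs != right.crs:
        right = right.to_crs(left.crs)
    return gpd.overlay(left, right, how=how)

# Toy layers: a 1 km square in British National Grid, and a larger box stored in WGS 84
square_bng = gpd.GeoDataFrame(
    {"square": ["s"]}, geometry=[box(530000, 180000, 531000, 181000)], crs="EPSG:27700"
)
zone_wgs = gpd.GeoDataFrame(
    {"zone": ["z"]}, geometry=[box(529500, 179500, 531500, 181500)], crs="EPSG:27700"
).to_crs(epsg=4326)

clipped = overlay_same_crs(square_bng, zone_wgs)
print(clipped.crs.to_epsg(), round(clipped.area.sum()))
```

Because the larger box fully contains the square, the intersection is the square itself and its area is reported in square metres.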
Intersection
In this case, we would like to create a new layer that contains only the areas of the LAs in London that are within the CCZ boundary. This is done by performing an intersection operation between the two geo-dfs (two layers).
# Perform intersection operation between London LA boundaries and CCZ boundary.
gdf_ccz_la = gpd.overlay(gdf_ccz, gdf_uk_london, how='intersection')
# We can observe that 8 LAs in London overlap the CCZ boundary (8 rows in the GeoDataFrame), so the intersection generates 8 new areas (zones).
gdf_ccz_la
| | OBJECTID | BOUNDARY | Shape_Area | FID | LAD24CD | LAD24NM | LAD24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | CSS Area | 21.375571 | 264 | E09000001 | City of London | | 532382 | 181358 | -0.09352 | 51.51564 | 741710fd-03e1-4b41-8645-7ebcfc5961ac | POLYGON ((533741.949 181255.415, 533743.557 18... |
| 1 | 1 | CSS Area | 21.375571 | 270 | E09000007 | Camden | | 527491 | 184283 | -0.16291 | 51.54305 | 88f3f650-53ac-434d-883e-6235b48d14c4 | POLYGON ((528914.128 182173.503, 528917.252 18... |
| 2 | 1 | CSS Area | 21.375571 | 275 | E09000012 | Hackney | | 534560 | 185787 | -0.06046 | 51.55493 | a41cf0fc-95ed-4823-a9db-7b8bc60f421b | POLYGON ((532960.103 182538.598, 532960.597 18... |
| 3 | 1 | CSS Area | 21.375571 | 282 | E09000019 | Islington | | 531160 | 184645 | -0.10990 | 51.54547 | 43a872fc-e401-45af-a2e6-3b2befc98837 | POLYGON ((531582.254 183032.797, 531584.653 18... |
| 4 | 1 | CSS Area | 21.375571 | 285 | E09000022 | Lambeth | | 531118 | 175629 | -0.11385 | 51.46445 | 9143bbc4-6bda-4e17-8148-cae2813f694d | POLYGON ((531750.948 178636.445, 531734.903 17... |
| 5 | 1 | CSS Area | 21.375571 | 291 | E09000028 | Southwark | | 533945 | 175869 | -0.07309 | 51.46595 | 7d09d625-212a-4819-9b12-36c8d8d2b697 | POLYGON ((533584.147 180067.345, 533558.653 18... |
| 6 | 1 | CSS Area | 21.375571 | 293 | E09000030 | Tower Hamlets | | 536340 | 181452 | -0.03648 | 51.51555 | 54f82e61-dc15-42cb-b777-681b4326f855 | MULTIPOLYGON (((533533.852 182103.403, 533568.... |
| 7 | 1 | CSS Area | 21.375571 | 296 | E09000033 | Westminster | | 528268 | 180871 | -0.15295 | 51.51222 | a52ff0f8-4ac0-4417-8338-01be3cbb516a | POLYGON ((530031.102 178265.801, 530007.101 17... |
# We can plot the 8 new zones using the plot method.
# We can observe that some LA boundaries have been cut by the CCZ boundary.
gdf_ccz_la.plot(figsize=(6, 6), edgecolor='grey', color='orange', alpha=0.7, linewidth=0.4)
<Axes: >
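To make the behaviour of the intersection easier to see, here is a minimal sketch on made-up squares rather than the London data (districts, zone and pieces are hypothetical names). It shows that the result has one row per overlapping pair, carries the attributes of both layers, and that each piece can be measured with the area attribute.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers: two adjacent "districts" and one "zone" that straddles both.
districts = gpd.GeoDataFrame(
    {"district": ["west", "east"]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)],
    crs="EPSG:27700",
)
zone = gpd.GeoDataFrame(
    {"zone": ["demo zone"]},
    geometry=[box(1, 0, 3, 1)],
    crs="EPSG:27700",
)

# Intersection keeps only the parts of each district that fall inside the zone.
pieces = gpd.overlay(zone, districts, how="intersection")

# Each piece has attributes from both layers; area is in CRS units squared.
pieces["area_m2"] = pieces.area
print(pieces[["zone", "district", "area_m2"]])
```

The same pattern applied to gdf_ccz_la would give the area of each LA that lies inside the CCZ.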
Union
In this case, we would like to create a new union layer that consists of Camden and CCZ. This is done by performing a union operation between the two geo-dfs (two layers).
# Select Camden LA boundary from the London LA boundaries.
gdf_camden = gdf_uk_london[gdf_uk_london['LAD24NM'] == 'Camden']
# plot the Camden LA boundary.
gdf_camden.plot(figsize=(6, 6), edgecolor='grey', color='skyblue', alpha=0.7, linewidth=0.4)
<Axes: >
# Perform union operation between Camden LA boundary and CCZ boundary.
gdf_union = gpd.overlay(gdf_camden, gdf_ccz, how='union')
# We can observe that there are 3 rows in the new GeoDataFrame.
gdf_union
| | FID | LAD24CD | LAD24NM | LAD24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | OBJECTID | BOUNDARY | Shape_Area | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 270.0 | E09000007 | Camden | | 527491.0 | 184283.0 | -0.16291 | 51.54305 | 88f3f650-53ac-434d-883e-6235b48d14c4 | 1.0 | CSS Area | 21.375571 | POLYGON ((530918.69 182415.114, 531558.692 181... |
| 1 | 270.0 | E09000007 | Camden | | 527491.0 | 184283.0 | -0.16291 | 51.54305 | 88f3f650-53ac-434d-883e-6235b48d14c4 | NaN | NaN | NaN | POLYGON ((529703.128 185186.98, 529820.513 184... |
| 2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | CSS Area | 21.375571 | POLYGON ((531582.254 183032.797, 531584.653 18... |
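We can observe that the new union layer consists of three parts/areas:
The intersected area that is in both the Camden LA boundary and the CCZ boundary (row 0, with attributes from both layers).
The Camden area that is not in the CCZ (row 1, where the CCZ attributes are NaN).
The CCZ area that is not in the Camden LA boundary (row 2, where the Camden attributes are NaN).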
# Plot each area in the new union zone.
for i in range(len(gdf_union)):
    gdf_union.iloc[[i]].plot(figsize=(3, 3), edgecolor='grey', color='orange', alpha=0.7, linewidth=0.4)
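The NaN pattern in a union result can be used to label where each part comes from. The sketch below is one possible way to do this on two made-up overlapping squares (a, b, a_name, b_name and the part column are hypothetical names, not part of the tutorial data).

```python
import geopandas as gpd
from shapely.geometry import box

# Two hypothetical overlapping squares, each with a single attribute column.
a = gpd.GeoDataFrame({"a_name": ["A"]}, geometry=[box(0, 0, 2, 2)], crs="EPSG:27700")
b = gpd.GeoDataFrame({"b_name": ["B"]}, geometry=[box(1, 1, 3, 3)], crs="EPSG:27700")

union = gpd.overlay(a, b, how="union")

# A missing attribute on one side means the part lies only in the other layer.
union["part"] = "both layers"
union.loc[union["b_name"].isna(), "part"] = "only A"
union.loc[union["a_name"].isna(), "part"] = "only B"
union["area"] = union.area

print(union[["a_name", "b_name", "part", "area"]])
```

The three parts together cover the combined footprint of both squares, so their areas sum to the area of the merged shape.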
Symmetric difference
In this case, we would like to create a new layer that consists of the areas of Camden and CCZ that are not in common. This is done by performing a symmetric difference operation between the two geo-dfs (two layers).
# Perform a symmetric difference operation between Camden LA boundary and CCZ boundary.
gdf_sym_diff = gpd.overlay(gdf_camden, gdf_ccz, how='symmetric_difference')
# We can observe that there are 2 rows in the new GeoDataFrame.
gdf_sym_diff
| | FID | LAD24CD | LAD24NM | LAD24NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | OBJECTID | BOUNDARY | Shape_Area | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 270.0 | E09000007 | Camden | | 527491.0 | 184283.0 | -0.16291 | 51.54305 | 88f3f650-53ac-434d-883e-6235b48d14c4 | NaN | NaN | NaN | POLYGON ((529703.128 185186.98, 529820.513 184... |
| 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | CSS Area | 21.375571 | POLYGON ((531582.254 183032.797, 531584.653 18... |
We can observe that the new zone consists of two parts/ areas:
Camden area that is not in CCZ.
CCZ area that is not in Camden LA boundary.
This means that, compared with the union operation, the symmetric difference returns only these two areas and drops the intersected area.
# Plot each area in the new zone.
for i in range(len(gdf_sym_diff)):
    gdf_sym_diff.iloc[[i]].plot(figsize=(3, 3), edgecolor='grey', color='orange', alpha=0.7, linewidth=0.4)
Difference
In this case, we would like to create a new layer that consists of the areas of Camden that are not in CCZ. This is done by performing a difference operation between the two geo-dfs (two layers).
# Perform a difference operation between Camden LA boundary and CCZ boundary.
gdf_diff_1 = gpd.overlay(gdf_camden, gdf_ccz, how='difference')
# plot the new zone.
gdf_diff_1.plot(figsize=(3, 3), edgecolor='grey', color='orange', alpha=0.7, linewidth=0.4)
<Axes: >
If we change the order of the gdfs in the function, we will get a different layer that consists of the areas of CCZ that are not in Camden.
# Perform a difference operation between CCZ boundary and Camden LA boundary.
gdf_diff_2 = gpd.overlay(gdf_ccz, gdf_camden, how='difference')
# plot the new zone.
gdf_diff_2.plot(figsize=(3, 3), edgecolor='grey', color='orange', alpha=0.7, linewidth=0.4)
<Axes: >
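The order dependence of the difference operation, and its link to the symmetric difference, can be checked numerically. The following is a minimal sketch on two made-up squares of different sizes (a, b and the result names are hypothetical): the two differences have different areas, and together they add up to the symmetric difference.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers: a large square (area 16) and a small square (area 4)
# that overlap in a 1 x 1 corner.
a = gpd.GeoDataFrame({"a_name": ["A"]}, geometry=[box(0, 0, 4, 4)], crs="EPSG:27700")
b = gpd.GeoDataFrame({"b_name": ["B"]}, geometry=[box(3, 3, 5, 5)], crs="EPSG:27700")

a_minus_b = gpd.overlay(a, b, how="difference")  # part of A outside B
b_minus_a = gpd.overlay(b, a, how="difference")  # part of B outside A
sym_diff = gpd.overlay(a, b, how="symmetric_difference")

print("A - B area:", a_minus_b.area.sum())
print("B - A area:", b_minus_a.area.sum())
print("symmetric difference area:", sym_diff.area.sum())

# A difference result keeps only the attribute columns of the first layer.
print(list(a_minus_b.columns))
```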
1.4 Spatial Join#
Like the join function in Pandas, a spatial join in GeoPandas combines two GeoDataFrames, but it matches rows by their spatial relationship rather than by a shared key. This is useful for analyzing the relationship between different layers and types of geospatial data. For example, we can use a spatial join to combine a large dataset of points with polygons to find out which points are within which polygons. Spatial join is one of the data merging techniques in GeoPandas, and more details are available in the GeoPandas documentation on merging data. Please note that a spatial join is different from an overlay operation: an overlay creates new geometries from the two layers (for example, their intersection), whereas a spatial join leaves the geometries unchanged and only combines the attributes of the two layers based on their spatial relationship.
The default spatial index in GeoPandas currently supports the following values for predicate which are defined in the Shapely documentation:
intersects: returns all geometries that intersect with the other geometry.
contains: returns all geometries that contain the other geometry.
within: returns all geometries that are within the other geometry. The within predicate is the inverse of contains.
crosses: returns all geometries that cross the other geometry. This means that their interiors share some but not all points (for example, a line that passes through a polygon's boundary).
touches: returns all geometries that touch the other geometry. This means that they share at least one boundary point but their interiors do not overlap (for example, adjacent polygons).
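The predicates above can be tried directly on individual Shapely geometries. The sketch below uses made-up shapes (zone, inside_pt, crossing_line and neighbour are hypothetical names) to show how each predicate answers for a point inside a square, a line running through it, and an adjacent square sharing one edge.

```python
from shapely.geometry import LineString, Point, box

zone = box(0, 0, 2, 2)                         # a square "zone"
inside_pt = Point(1, 1)                        # a point in the middle of the zone
crossing_line = LineString([(-1, 1), (3, 1)])  # a line passing through the zone
neighbour = box(2, 0, 4, 2)                    # a square sharing one edge with the zone

results = {
    "point within zone": inside_pt.within(zone),
    "zone contains point": zone.contains(inside_pt),
    "line crosses zone": crossing_line.crosses(zone),
    "line within zone": crossing_line.within(zone),
    "neighbour touches zone": neighbour.touches(zone),
    "neighbour intersects zone": neighbour.intersects(zone),
}
for label, value in results.items():
    print(f"{label}: {value}")
```

Only "line within zone" is False here: the line intersects and crosses the zone but extends beyond it, which is the difference between within and intersects that matters for the cycle routes later in this section.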
Here, we will use the London dataset to demonstrate the spatial join operation. First, we will perform a spatial join between the CCZ zone and London Underground stations to find out which underground stations are within the CCZ zone.
# We read the London Underground station locations in geojson format.
gdf_underground = gpd.read_file("data/Underground_Stations.geojson")
# We need to reproject the underground stations from WGS 84 to OSGB 1936 / British National Grid.
gdf_underground = gdf_underground.to_crs(epsg=27700)
# Perform a spatial join between London Underground stations and CCZ zone.
gdf_ccz_underground = gpd.sjoin(gdf_underground, gdf_ccz, how='inner', predicate='within')
Please note that the how parameter is set to inner, which means that only the stations that have a match in the CCZ layer will be returned.
The predicate parameter is set to within, which means that a station (left gdf) is matched only if it lies within the CCZ (right gdf).
If we set the how parameter to left, all the stations will be returned, but only the stations that are within the CCZ will have the CCZ attributes.
If we set the how parameter to right, all the CCZ zones will be returned (with the CCZ geometry), but only a CCZ zone that contains underground stations will have the underground station attributes.
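The effect of the how parameter is easiest to see when some features have no match. The sketch below is a toy example under that assumption (stations and zones here are made-up layers, not the tutorial data): one station lies inside a zone, one lies outside every zone, and one zone contains no station.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical points: one inside a zone, one outside every zone.
stations = gpd.GeoDataFrame(
    {"station": ["inside", "outside"]},
    geometry=[Point(1, 1), Point(5, 5)],
    crs="EPSG:27700",
)
# Hypothetical polygons: one containing a station, one containing none.
zones = gpd.GeoDataFrame(
    {"zone": ["with station", "empty"]},
    geometry=[box(0, 0, 2, 2), box(10, 10, 12, 12)],
    crs="EPSG:27700",
)

inner = gpd.sjoin(stations, zones, how="inner", predicate="within")
left = gpd.sjoin(stations, zones, how="left", predicate="within")
right = gpd.sjoin(stations, zones, how="right", predicate="within")

print(len(inner))                # matched stations only
print(left[["station", "zone"]])   # every station; zone is NaN when unmatched
print(right[["station", "zone"]])  # every zone; station is NaN when unmatched
```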
# We can observe that there are 37 underground stations that are within the CCZ zone.
gdf_ccz_underground
| | OBJECTID_left | NAME | LINES | ATCOCODE | MODES | ACCESSIBILITY | NIGHT_TUBE | NETWORK | DATASET_LAST_UPDATED | FULL_NAME | geometry | index_right | OBJECTID_right | BOUNDARY | Shape_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 111 | St. Paul's | Central | 940GZZLUSPU | bus, tube | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | St. Paul's station | POINT Z (532110.145 181274.419 0) | 0 | 1 | CSS Area | 21.375571 |
| 36 | 147 | Leicester Square | Piccadilly, Northern | 940GZZLULSQ | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Leicester Square station | POINT Z (529982.188 180824.696 0) | 0 | 1 | CSS Area | 21.375571 |
| 37 | 148 | Covent Garden | Piccadilly | 940GZZLUCGN | tube | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Covent Garden station | POINT Z (530252.371 181025.223 0) | 0 | 1 | CSS Area | 21.375571 |
| 38 | 149 | Russell Square | Piccadilly | 940GZZLURSQ | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Russell Square station | POINT Z (530230.87 182127.97 0) | 0 | 1 | CSS Area | 21.375571 |
| 46 | 157 | Temple | District, Circle | 940GZZLUTMP | tube, bus | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Temple station | POINT Z (530960.433 180803.046 0) | 0 | 1 | CSS Area | 21.375571 |
| 47 | 158 | Blackfriars | District, Circle | 940GZZLUBKF | tube | Fully Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Blackfriars station | POINT Z (531695.661 180893.308 0) | 0 | 1 | CSS Area | 21.375571 |
| 48 | 159 | Mansion House | District, Circle | 940GZZLUMSH | tube, bus | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Mansion House station | POINT Z (532355.962 180931.785 0) | 0 | 1 | CSS Area | 21.375571 |
| 49 | 160 | Cannon Street | District, Circle | 940GZZLUCST | tube | Partially Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Cannon Street station | POINT Z (532613.273 180900.285 0) | 0 | 1 | CSS Area | 21.375571 |
| 50 | 161 | Tower Hill | District, Circle | 940GZZLUTWH | tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Tower Hill station | POINT Z (533581.262 180755.588 0) | 0 | 1 | CSS Area | 21.375571 |
| 51 | 162 | Aldgate | Metropolitan, Circle | 940GZZLUALD | bus, tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Aldgate station | POINT Z (533614.972 181262.505 0) | 0 | 1 | CSS Area | 21.375571 |
| 52 | 163 | Liverpool Street | Metropolitan, Central, Circle, Hammersmith & City | 940GZZLULVT | tube | Partially Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Liverpool Street station | POINT Z (533095.667 181567.145 0) | 0 | 1 | CSS Area | 21.375571 |
| 53 | 164 | Moorgate | Metropolitan, Northern, Circle, Hammersmith & ... | 940GZZLUMGT | tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Moorgate station | POINT Z (532669.633 181668.478 0) | 0 | 1 | CSS Area | 21.375571 |
| 54 | 165 | Barbican | Metropolitan, Circle, Hammersmith & City | 940GZZLUBBN | tube, bus | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Barbican station | POINT Z (532006.001 181856.574 0) | 0 | 1 | CSS Area | 21.375571 |
| 55 | 166 | Farringdon | Metropolitan, Circle, Hammersmith & City | 940GZZLUFCN | tube | Fully Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Farringdon station | POINT Z (531561.791 181874.173 0) | 0 | 1 | CSS Area | 21.375571 |
| 78 | 189 | Warren Street | Northern, Victoria | 940GZZLUWRR | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Warren Street station | POINT Z (529250.51 182266.279 0) | 0 | 1 | CSS Area | 21.375571 |
| 92 | 203 | Marble Arch | Central | 940GZZLUMBA | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Marble Arch station | POINT Z (527961.86 181017.689 0) | 0 | 1 | CSS Area | 21.375571 |
| 93 | 204 | Bond Street | Central, Jubilee | 940GZZLUBND | tube, bus | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Bond Street station | POINT Z (528492.413 181117.852 0) | 0 | 1 | CSS Area | 21.375571 |
| 98 | 209 | Bank | Waterloo & City, Northern, Central | 940GZZLUBNK | tube | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Bank station | POINT Z (532711.957 181120.121 0) | 0 | 1 | CSS Area | 21.375571 |
| 99 | 210 | Oxford Circus | Central, Bakerloo, Victoria | 940GZZLUOXC | tube, bus | Partially Accessible - Interchange Only | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Oxford Circus station | POINT Z (529048.363 181236.558 0) | 0 | 1 | CSS Area | 21.375571 |
| 100 | 211 | Holborn | Central, Piccadilly | 940GZZLUHBN | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Holborn station | POINT Z (530513.836 181525.303 0) | 0 | 1 | CSS Area | 21.375571 |
| 101 | 212 | Chancery Lane | Central | 940GZZLUCHL | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Chancery Lane station | POINT Z (531125.73 181615.704 0) | 0 | 1 | CSS Area | 21.375571 |
| 118 | 229 | Goodge Street | Northern | 940GZZLUGDG | tube, bus | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Goodge Street station | POINT Z (529539.037 181836.829 0) | 0 | 1 | CSS Area | 21.375571 |
| 119 | 230 | Tottenham Court Road | Central, Northern | 940GZZLUTCR | bus, tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Tottenham Court Road station | POINT Z (529817.144 181382.723 0) | 0 | 1 | CSS Area | 21.375571 |
| 148 | 259 | Embankment | District, Bakerloo, Northern, Circle | 940GZZLUEMB | tube | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Embankment station | POINT Z (530421.085 180396.077 0) | 0 | 1 | CSS Area | 21.375571 |
| 157 | 268 | Piccadilly Circus | Bakerloo, Piccadilly | 940GZZLUPCC | bus, tube | Not Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Piccadilly Circus station | POINT Z (529614.395 180665.739 0) | 0 | 1 | CSS Area | 21.375571 |
| 158 | 269 | Charing Cross | Bakerloo, Northern | 940GZZLUCHX | tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Charing Cross station | POINT Z (530059.453 180378.586 0) | 0 | 1 | CSS Area | 21.375571 |
| 159 | 270 | Elephant & Castle | Northern, Bakerloo | 940GZZLUEAC | tube | Partially Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Elephant & Castle station | POINT Z (531910.732 179142.068 0) | 0 | 1 | CSS Area | 21.375571 |
| 160 | 271 | Lambeth North | Bakerloo | 940GZZLULBN | bus, tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Lambeth North station | POINT Z (531135.34 179456.379 0) | 0 | 1 | CSS Area | 21.375571 |
| 172 | 283 | Westminster | District, Circle, Jubilee | 940GZZLUWSM | tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Westminster station | POINT Z (530197.419 179668.415 0) | 0 | 1 | CSS Area | 21.375571 |
| 175 | 286 | Waterloo | Waterloo & City, Bakerloo, Northern, Jubilee | 940GZZLUWLO | tube | Partially Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Waterloo station | POINT Z (530969.943 179962.112 0) | 0 | 1 | CSS Area | 21.375571 |
| 182 | 293 | Green Park | Piccadilly, Victoria, Jubilee | 940GZZLUGPK | bus, tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Green Park station | POINT Z (529008.967 180295.035 0) | 0 | 1 | CSS Area | 21.375571 |
| 222 | 333 | Southwark | Jubilee | 940GZZLUSWK | bus, tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | Southwark station | POINT Z (531594.114 180074.192 0) | 0 | 1 | CSS Area | 21.375571 |
| 223 | 334 | London Bridge | Northern, Jubilee | 940GZZLULNB | tube | Fully Accessible | Yes | London Underground | 2021-11-29 00:00:00+00:00 | London Bridge station | POINT Z (532684.838 180189.059 0) | 0 | 1 | CSS Area | 21.375571 |
| 231 | 387 | Monument | District, Circle | 940GZZLUMMT | tube, bus | Partially Accessible - Interchange Only | No | London Underground | 2021-11-29 00:00:00+00:00 | Monument station | POINT Z (532913.72 180824.294 0) | 0 | 1 | CSS Area | 21.375571 |
| 242 | 398 | St. James's Park | District, Circle | 940GZZLUSJP | tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | St. James's Park station | POINT Z (529648.342 179498.258 0) | 0 | 1 | CSS Area | 21.375571 |
| 264 | 479 | Borough | Northern | 940GZZLUBOR | bus, tube | Partially Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Borough station | POINT Z (532441.745 179751.658 0) | 0 | 1 | CSS Area | 21.375571 |
| 265 | 480 | Old Street | Northern | 940GZZLUODS | tube | Not Accessible | No | London Underground | 2021-11-29 00:00:00+00:00 | Old Street station | POINT Z (532766.27 182419.055 0) | 0 | 1 | CSS Area | 21.375571 |
# We can plot the underground stations and the CCZ zone.
fig, ax = plt.subplots(figsize=(6, 6))
gdf_ccz.plot(ax=ax, edgecolor='black', color='#fff2f2', alpha=1, linewidth=0.6)
gdf_ccz_underground.plot(ax=ax, edgecolor='grey', color='royalblue', alpha=1, linewidth=0.4, markersize=15)
ax.set_axis_off()
Second, we can also perform a spatial join between the London cycle routes and the CCZ to find out which cycle routes are within the CCZ zone.
# Read the London cycle routes in geojson format.
gdf_cycle_routes = gpd.read_file("data/Cycle_Routes.geojson")
# We reproject the cycle routes from WGS 84 to OSGB 1936 / British National Grid.
gdf_cycle_routes = gdf_cycle_routes.to_crs(epsg=27700)
# Perform a spatial join between London cycle routes and CCZ zone.
gdf_ccz_cycle_routes = gpd.sjoin(gdf_cycle_routes, gdf_ccz, how='inner', predicate='within')
# We can observe that there are 13 cycle routes that are within the CCZ zone.
gdf_ccz_cycle_routes
| | LABEL | PROGRAMME | ROUTE_NAME | ROUTE | MILESTONE | STATUS | PUBLIC_ | ROUTE_LENGTH_KM | PROGRAMME_UPDATED | OBJECTID_left | Shape__Length | geometry | index_right | OBJECTID_right | BOUNDARY | Shape_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | C41 | Cycleways | Euston to Holborn | Euston to Holborn | Complete | Complete | Yes | 1.181 | 2023-11-21 00:00:00+00:00 | 1995 | 1180.878708 | LINESTRING Z (530257.989 182468.202 0, 530288.... | 0 | 1 | CSS Area | 21.375571 |
| 11 | C56 | Cycleways | C5 to Westmister Bridge | C5 to Westmister Bridge | Complete | Complete | Yes | 1.196 | 2023-11-21 00:00:00+00:00 | 2002 | 1196.244254 | MULTILINESTRING Z ((530954.753 179173.18 0, 53... | 0 | 1 | CSS Area | 21.375571 |
| 26 | C | Cycle Superhighways | Lancaster Gate to Barking | Horse Guards Road | Complete | Complete | Yes | 1.336 | 2023-11-21 00:00:00+00:00 | 2017 | 1336.495364 | LINESTRING Z (529903.742 179698.188 0, 529903.... | 0 | 1 | CSS Area | 21.375571 |
| 40 | C | Cycleways | Lambeth Roundabout to P.Square | Lambeth Roundabout to P.Square | (SG3) Concept Design | Feasibility | Yes | 0.598 | 2023-11-21 00:00:00+00:00 | 2031 | 598.403380 | LINESTRING Z (530155.358 179582.367 0, 530160.... | 0 | 1 | CSS Area | 21.375571 |
| 42 | C11 | Cycleways | Essex Road to Farringdon | The City to Farringdon | Complete | Complete | Yes | 1.243 | 2023-11-21 00:00:00+00:00 | 2033 | 1243.120260 | LINESTRING Z (532591.297 181938.706 0, 532574.... | 0 | 1 | CSS Area | 21.375571 |
| 50 | C | Cycleways | C1 to Liverpool Street | C1 to Liverpool Street | Complete | Complete | Yes | 0.525 | 2023-11-21 00:00:00+00:00 | 2041 | 524.539812 | LINESTRING Z (532944.816 181895.234 0, 533036.... | 0 | 1 | CSS Area | 21.375571 |
| 73 | C | Cycleways | Lambeth Roundabout - North & South | Lambeth Roundabout - North & South | (SG4) Detailed Design | Feasibility | Yes | 0.519 | 2023-11-21 00:00:00+00:00 | 2064 | 519.098639 | LINESTRING Z (530531.437 178939.038 0, 530531.... | 0 | 1 | CSS Area | 21.375571 |
| 103 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.942 | 2023-11-21 00:00:00+00:00 | 2094 | 942.260282 | LINESTRING Z (529477.755 181246.222 25.5, 5295... | 0 | 1 | CSS Area | 21.375571 |
| 110 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.365 | 2023-11-21 00:00:00+00:00 | 2101 | 365.025555 | LINESTRING Z (529360.759 181642.231 28.4, 5293... | 0 | 1 | CSS Area | 21.375571 |
| 117 | C | Cycleways | C4 to C14 and C10 | C4 to C14 and C10 | (SG4) Detailed Design | Feasibility | Yes | 0.377 | 2023-11-21 00:00:00+00:00 | 2108 | 377.333957 | LINESTRING Z (533001.786 179343.19 0, 532949.3... | 0 | 1 | CSS Area | 21.375571 |
| 118 | C | Cycleways | Waterloo Bridge | Waterloo Bridge | Complete | Complete | Yes | 0.491 | 2023-11-21 00:00:00+00:00 | 2109 | 490.560628 | LINESTRING Z (530690.767 180675.212 0, 530941.... | 0 | 1 | CSS Area | 21.375571 |
| 138 | C | Cycleways | Old Paradise Street | Old Paradise Street | Complete | Complete | Yes | 0.317 | 2023-11-21 00:00:00+00:00 | 2129 | 316.683268 | LINESTRING Z (530822.748 178867.173 0, 530814.... | 0 | 1 | CSS Area | 21.375571 |
| 161 | C10 | Cycleways | Blomsbury to Embankment | Blomsbury to Embankment | Complete | Complete | Yes | 1.421 | 2023-11-21 00:00:00+00:00 | 2152 | 1420.613731 | MULTILINESTRING Z ((530661.767 180658.212 0, 5... | 0 | 1 | CSS Area | 21.375571 |
# We can plot the cycle routes and the CCZ zone.
fig, ax = plt.subplots(figsize=(6, 6))
gdf_ccz.plot(ax=ax, edgecolor='black', color='#fff2f2', alpha=1, linewidth=0.6)
gdf_ccz_cycle_routes.plot(ax=ax, edgecolor='grey', color='royalblue', alpha=1, linewidth=2)
ax.set_axis_off()
Third, we can also perform a spatial join between the London cycle routes and the CCZ to find out which cycle routes intersect the CCZ zone.
# Perform a spatial join between London cycle routes and CCZ zone.
gdf_ccz_cycle_routes_intersect = gpd.sjoin(gdf_cycle_routes, gdf_ccz, how='inner', predicate='intersects')
# We can observe that there are 33 cycle routes that intersect the CCZ zone.
gdf_ccz_cycle_routes_intersect
| | LABEL | PROGRAMME | ROUTE_NAME | ROUTE | MILESTONE | STATUS | PUBLIC_ | ROUTE_LENGTH_KM | PROGRAMME_UPDATED | OBJECTID_left | Shape__Length | geometry | index_right | OBJECTID_right | BOUNDARY | Shape_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C38 | Cycleways | Finsbury Park to Highbury Fields | Islington to Finsbury | (SG2) Option Selection | Feasibility | Yes | 0.566 | 2023-11-21 00:00:00+00:00 | 1991 | 565.899062 | LINESTRING Z (531184.797 182813.267 0, 531180.... | 0 | 1 | CSS Area | 21.375571 |
| 4 | C41 | Cycleways | Euston to Holborn | Euston to Holborn | Complete | Complete | Yes | 1.181 | 2023-11-21 00:00:00+00:00 | 1995 | 1180.878708 | LINESTRING Z (530257.989 182468.202 0, 530288.... | 0 | 1 | CSS Area | 21.375571 |
| 11 | C56 | Cycleways | C5 to Westmister Bridge | C5 to Westmister Bridge | Complete | Complete | Yes | 1.196 | 2023-11-21 00:00:00+00:00 | 2002 | 1196.244254 | MULTILINESTRING Z ((530954.753 179173.18 0, 53... | 0 | 1 | CSS Area | 21.375571 |
| 15 | C6 | Cycleways | Elephant and Castle to Hampstead | Elephant and Castle to Kings Cross | Complete | Complete | Yes | 7.944 | 2023-11-21 00:00:00+00:00 | 2006 | 7944.208889 | MULTILINESTRING Z ((531908.906 179052.174 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 21 | CS7 | Cycleways | Merton to The City | Merton to The City | Complete | Complete | Yes | 12.525 | 2023-11-21 00:00:00+00:00 | 2012 | 12525.201532 | MULTILINESTRING Z ((526717.872 170267.511 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 22 | C3 | Cycleways | Lancaster Gate to Barking | Lancaster Gate to Barking | Complete | Complete | Yes | 22.425 | 2023-11-21 00:00:00+00:00 | 2013 | 22425.196420 | LINESTRING Z (545215.501 183267.379 0, 545183.... | 0 | 1 | CSS Area | 21.375571 |
| 26 | C | Cycle Superhighways | Lancaster Gate to Barking | Horse Guards Road | Complete | Complete | Yes | 1.336 | 2023-11-21 00:00:00+00:00 | 2017 | 1336.495364 | LINESTRING Z (529903.742 179698.188 0, 529903.... | 0 | 1 | CSS Area | 21.375571 |
| 27 | C8 | Cycleways | Wandsworth to Lambeth Bridge | Wandsworth to Battersea Park | Complete | Complete | Yes | 3.288 | 2023-11-21 00:00:00+00:00 | 2018 | 3284.253577 | MULTILINESTRING Z ((530246.431 178963.826 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 35 | C5 | Cycleways | Waterloo to Clapham | Waterloo to Clapham | Complete | Complete | Yes | 2.076 | 2023-11-21 00:00:00+00:00 | 2026 | 2076.093601 | LINESTRING Z (530519.358 178020.846 0, 530521.... | 0 | 1 | CSS Area | 21.375571 |
| 40 | C | Cycleways | Lambeth Roundabout to P.Square | Lambeth Roundabout to P.Square | (SG3) Concept Design | Feasibility | Yes | 0.598 | 2023-11-21 00:00:00+00:00 | 2031 | 598.403380 | LINESTRING Z (530155.358 179582.367 0, 530160.... | 0 | 1 | CSS Area | 21.375571 |
| 41 | C | Cycleways | Marylbone Road to Oxford Street | Marylbone Road to Oxford Street | (SG3) Concept Design | Feasibility | Yes | 0.905 | 2023-11-21 00:00:00+00:00 | 2032 | 905.142174 | MULTILINESTRING Z ((528934.12 181516.815 0, 52... | 0 | 1 | CSS Area | 21.375571 |
| 42 | C11 | Cycleways | Essex Road to Farringdon | The City to Farringdon | Complete | Complete | Yes | 1.243 | 2023-11-21 00:00:00+00:00 | 2033 | 1243.120260 | LINESTRING Z (532591.297 181938.706 0, 532574.... | 0 | 1 | CSS Area | 21.375571 |
| 49 | C | Cycleways | Waterloo to Clapham | C5 to C14 - The Cut_Union Street | Complete | Complete | Yes | 2.224 | 2023-11-21 00:00:00+00:00 | 2040 | 2224.308731 | MULTILINESTRING Z ((531314.766 179849.195 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 50 | C | Cycleways | C1 to Liverpool Street | C1 to Liverpool Street | Complete | Complete | Yes | 0.525 | 2023-11-21 00:00:00+00:00 | 2041 | 524.539812 | LINESTRING Z (532944.816 181895.234 0, 533036.... | 0 | 1 | CSS Area | 21.375571 |
| 54 | C52 | Quietways | Covent Garden to Euston | Covent Garden to Euston | Complete | Complete | Yes | 2.561 | 2023-11-21 00:00:00+00:00 | 2045 | 2560.879009 | MULTILINESTRING Z ((530690.767 180675.212 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 55 | C4 | Cycleways | London Bridge to Rotherhithe Roundabout | London Bridge to Rotherhithe Roundabout | Complete | Complete | Yes | 2.998 | 2023-11-21 00:00:00+00:00 | 2046 | 2997.526255 | MULTILINESTRING Z ((533884.72 179664.948 0, 53... | 0 | 1 | CSS Area | 21.375571 |
| 73 | C | Cycleways | Lambeth Roundabout - North & South | Lambeth Roundabout - North & South | (SG4) Detailed Design | Feasibility | Yes | 0.519 | 2023-11-21 00:00:00+00:00 | 2064 | 519.098639 | LINESTRING Z (530531.437 178939.038 0, 530531.... | 0 | 1 | CSS Area | 21.375571 |
| 77 | C1 | Cycleways | Freezey Water to The City | Freezey Water to The City | Complete | Complete | Yes | 20.165 | 2023-11-21 00:00:00+00:00 | 2068 | 20164.517337 | MULTILINESTRING Z ((533373.843 185167.703 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 82 | C17 | Cycleways | Elephant and Castle to Dulwich | Elephant and Castle to Camberwell | Complete | Complete | Yes | 4.002 | 2023-11-21 00:00:00+00:00 | 2073 | 4001.525575 | MULTILINESTRING Z ((533054.791 176407.129 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 83 | C14 | Cycleways | Blackfriars to Rotherhithe | Blackfriars to Rotherhithe | Complete | Complete | Yes | 6.995 | 2023-11-21 00:00:00+00:00 | 2074 | 6994.926446 | LINESTRING Z (536853.163 178583.076 0, 536808.... | 0 | 1 | CSS Area | 21.375571 |
| 103 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.942 | 2023-11-21 00:00:00+00:00 | 2094 | 942.260282 | LINESTRING Z (529477.755 181246.222 25.5, 5295... | 0 | 1 | CSS Area | 21.375571 |
| 109 | C51 | Central London Grid | Marylebone to Kilburn | Marylebone to Kilburn | (SG4) Detailed Design | Feasibility | Yes | 3.296 | 2023-11-21 00:00:00+00:00 | 2100 | 3266.659774 | MULTILINESTRING Z ((525974.729 183317.261 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 110 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.365 | 2023-11-21 00:00:00+00:00 | 2101 | 365.025555 | LINESTRING Z (529360.759 181642.231 28.4, 5293... | 0 | 1 | CSS Area | 21.375571 |
| 113 | C27 | Quietways | East Acton to Walthamstow | East Acton to Walthamstow | Complete | Complete | Yes | 8.679 | 2023-11-21 00:00:00+00:00 | 2104 | 8678.992253 | MULTILINESTRING Z ((530917.187 182415.036 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 114 | C27 | Cycleways | East Acton to Walthamstow | East Acton to Walthamstow | Complete | Complete | Yes | 16.363 | 2023-11-21 00:00:00+00:00 | 2105 | 16363.310394 | MULTILINESTRING Z ((525557.685 180543.199 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 117 | C | Cycleways | C4 to C14 and C10 | C4 to C14 and C10 | (SG4) Detailed Design | Feasibility | Yes | 0.377 | 2023-11-21 00:00:00+00:00 | 2108 | 377.333957 | LINESTRING Z (533001.786 179343.19 0, 532949.3... | 0 | 1 | CSS Area | 21.375571 |
| 118 | C | Cycleways | Waterloo Bridge | Waterloo Bridge | Complete | Complete | Yes | 0.491 | 2023-11-21 00:00:00+00:00 | 2109 | 490.560628 | LINESTRING Z (530690.767 180675.212 0, 530941.... | 0 | 1 | CSS Area | 21.375571 |
| 120 | C | Cycleways | Oval to C5 | Oval to C5 | Complete | Complete | Yes | 1.032 | 2023-11-21 00:00:00+00:00 | 2111 | 1032.298014 | LINESTRING Z (530701.744 178648.168 0, 530746.... | 0 | 1 | CSS Area | 21.375571 |
| 126 | C10 | Cycleways | Waterloo to Greenwich | Waterloo to Greenwich | Complete | Complete | Yes | 10.543 | 2023-11-21 00:00:00+00:00 | 2117 | 10542.684407 | MULTILINESTRING Z ((534882.799 178422.148 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 138 | C | Cycleways | Old Paradise Street | Old Paradise Street | Complete | Complete | Yes | 0.317 | 2023-11-21 00:00:00+00:00 | 2129 | 316.683268 | LINESTRING Z (530822.748 178867.173 0, 530814.... | 0 | 1 | CSS Area | 21.375571 |
| 139 | C11 | Cycleways | Essex Road to Farringdon | Essex Road to The City | Complete | Complete | Yes | 2.123 | 2023-11-21 00:00:00+00:00 | 2130 | 2123.492454 | LINESTRING Z (531860.953 183765.955 0, 532108.... | 0 | 1 | CSS Area | 21.375571 |
| 145 | C13 | Cycleways | Old Street to London Fields | Old Street to London Fields | Complete | Complete | Yes | 3.184 | 2023-11-21 00:00:00+00:00 | 2136 | 3184.007679 | MULTILINESTRING Z ((532567.063 182352.813 0, 5... | 0 | 1 | CSS Area | 21.375571 |
| 161 | C10 | Cycleways | Blomsbury to Embankment | Blomsbury to Embankment | Complete | Complete | Yes | 1.421 | 2023-11-21 00:00:00+00:00 | 2152 | 1420.613731 | MULTILINESTRING Z ((530661.767 180658.212 0, 5... | 0 | 1 | CSS Area | 21.375571 |
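We can observe that some cycle routes are not within the CCZ zone but still intersect it. The intersects predicate keeps any route that shares at least one point with the CCZ, so routes that extend well beyond the boundary are also returned. This is especially visible for routes represented by MultiLineStrings, whose parts may not even be connected: the whole route is kept if any one of its parts intersects the CCZ.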
# We can plot the cycle routes and the CCZ zone.
fig, ax = plt.subplots(figsize=(6, 6))
gdf_ccz.plot(ax=ax, edgecolor='black', color='#fff2f2', alpha=1, linewidth=0.6)
gdf_ccz_cycle_routes_intersect.plot(ax=ax, edgecolor='grey', color='royalblue', alpha=1, linewidth=1)
ax.set_axis_off()
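The difference between the two predicates can be reproduced on a few made-up lines (zone and routes below are hypothetical layers for illustration only). The sketch includes a route fully inside the zone, one that crosses the boundary, one far away, and one MultiLineString with a part inside the zone and a disconnected part far outside it.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString, box

zone = gpd.GeoDataFrame({"zone": ["demo"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:27700")
routes = gpd.GeoDataFrame(
    {"route": ["inside", "crossing", "outside", "two parts"]},
    geometry=[
        LineString([(1, 1), (9, 9)]),      # entirely inside the zone
        LineString([(5, 5), (15, 5)]),     # starts inside, ends outside
        LineString([(20, 20), (30, 30)]),  # entirely outside
        MultiLineString([[(2, 2), (3, 3)], [(50, 50), (60, 60)]]),  # one part inside, one far away
    ],
    crs="EPSG:27700",
)

within = gpd.sjoin(routes, zone, how="inner", predicate="within")
intersects = gpd.sjoin(routes, zone, how="inner", predicate="intersects")

print(sorted(within["route"]))
print(sorted(intersects["route"]))
```

The "two parts" route fails within because one of its parts lies outside the zone, yet it passes intersects because the other part is inside, which mirrors what happens with the multipart cycle routes.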
Now, we keep only the cycle routes whose geometry type is LineString (dropping the routes stored as MultiLineStrings), and then perform the spatial join (intersects) with the CCZ.
# Keep only the cycle routes whose geometry type is LineString.
gdf_cycle_routes_s = gdf_cycle_routes[gdf_cycle_routes.geom_type == 'LineString']
gdf_cycle_routes_s
| | LABEL | PROGRAMME | ROUTE_NAME | ROUTE | MILESTONE | STATUS | PUBLIC_ | ROUTE_LENGTH_KM | PROGRAMME_UPDATED | OBJECTID | Shape__Length | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C38 | Cycleways | Finsbury Park to Highbury Fields | Islington to Finsbury | (SG2) Option Selection | Feasibility | Yes | 0.566 | 2023-11-21 00:00:00+00:00 | 1991 | 565.899062 | LINESTRING Z (531184.797 182813.267 0, 531180.... |
| 1 | C48 | Cycleways | Brixton to Clapham High Street | Brixton to Clapham High Street | Complete | Complete | Yes | 1.315 | 2023-11-21 00:00:00+00:00 | 1992 | 1315.182747 | LINESTRING Z (531061.018 175496.889 0, 531070.... |
| 4 | C41 | Cycleways | Euston to Holborn | Euston to Holborn | Complete | Complete | Yes | 1.181 | 2023-11-21 00:00:00+00:00 | 1995 | 1180.878708 | LINESTRING Z (530257.989 182468.202 0, 530288.... |
| 5 | C55 | Cycleways | Lancaster Gate to Hyde Park Corner | Lancaster Gate to Hyde Park Corner | Complete | Complete | Yes | 0.723 | 2023-11-21 00:00:00+00:00 | 1996 | 723.280344 | LINESTRING Z (526927.71 180815.208 0, 526937.7... |
| 7 | C49 | Cycleways | Acton to Chiswick | Acton to Chiswick | Complete | Complete | Yes | 4.210 | 2023-11-21 00:00:00+00:00 | 1998 | 4209.703771 | LINESTRING Z (521097.743 178513.647 0, 521108.... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 160 | C50 | Cycleways | Camden Town to Finsbury Park | Camden Town to York Way | Complete | Complete | Yes | 1.133 | 2023-11-21 00:00:00+00:00 | 2151 | 1132.913927 | LINESTRING Z (529231.783 184130.294 28.3, 5292... |
| 162 | C | Cycleways | Elephant and Castle to Hampstead | Castlehaven Grafton Road | Complete | Complete | Yes | 1.833 | 2023-11-21 00:00:00+00:00 | 2153 | 1832.819766 | LINESTRING Z (527667.882 185680.986 0, 527719.... |
| 163 | C40 | Cycleways | Brentford to Twickenham | Brentford to Twickenham | Complete | Complete | Yes | 4.291 | 2023-11-21 00:00:00+00:00 | 2154 | 4291.196598 | LINESTRING Z (516033.206 173764.74 0, 516048.3... |
| 164 | C4 | Cycleways | London Bridge to Rotherhithe Roundabout | Rotherhithe Roundabout to Lewisham | (SG5) Delivery | In Progress | Yes | 1.335 | 2023-11-21 00:00:00+00:00 | 2155 | 1335.083915 | LINESTRING Z (536022.822 178603.139 0, 535966.... |
| 165 | C39 | Cycleways | Kensington High St to Shepherds Bush | Kensington High St to Shepherds Bush | Complete | Complete | Yes | 1.151 | 2023-11-21 00:00:00+00:00 | 2156 | 1151.249368 | LINESTRING Z (524569.239 179008.338 0, 524509.... |
123 rows × 12 columns
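Filtering on the geometry type discards every MultiLineString route entirely. An alternative, sketched below under the assumption that treating each part as its own row is acceptable for the analysis, is to split multi-part geometries with `explode`. The `routes` frame and its values are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString

# Hypothetical routes: one single-part, one with two disconnected parts.
routes = gpd.GeoDataFrame(
    {'ROUTE_NAME': ['single route', 'two-part route']},
    geometry=[
        LineString([(0, 0), (1, 1)]),
        MultiLineString([[(2, 2), (3, 3)], [(10, 10), (11, 11)]]),
    ],
    crs='EPSG:27700',
)

# Check which geometry types are present before deciding how to filter.
print(routes.geom_type.value_counts())

# explode() turns each part of a multi-part geometry into its own row;
# the attribute columns are repeated for every part.
parts = routes.explode(ignore_index=True)
print(parts)
```

After exploding, every row is a plain LineString, so a spatial join evaluates each part separately instead of the whole multi-part feature.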
# Spatial join between the single-LineString routes and the CCZ.
gdf_ccz_cycle_routes_intersect_linestring = gpd.sjoin(gdf_cycle_routes_s, gdf_ccz, how='inner', predicate='intersects')
# 17 of the single-LineString cycle routes intersect the CCZ.
gdf_ccz_cycle_routes_intersect_linestring
| | LABEL | PROGRAMME | ROUTE_NAME | ROUTE | MILESTONE | STATUS | PUBLIC_ | ROUTE_LENGTH_KM | PROGRAMME_UPDATED | OBJECTID_left | Shape__Length | geometry | index_right | OBJECTID_right | BOUNDARY | Shape_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C38 | Cycleways | Finsbury Park to Highbury Fields | Islington to Finsbury | (SG2) Option Selection | Feasibility | Yes | 0.566 | 2023-11-21 00:00:00+00:00 | 1991 | 565.899062 | LINESTRING Z (531184.797 182813.267 0, 531180.... | 0 | 1 | CSS Area | 21.375571 |
| 4 | C41 | Cycleways | Euston to Holborn | Euston to Holborn | Complete | Complete | Yes | 1.181 | 2023-11-21 00:00:00+00:00 | 1995 | 1180.878708 | LINESTRING Z (530257.989 182468.202 0, 530288.... | 0 | 1 | CSS Area | 21.375571 |
| 22 | C3 | Cycleways | Lancaster Gate to Barking | Lancaster Gate to Barking | Complete | Complete | Yes | 22.425 | 2023-11-21 00:00:00+00:00 | 2013 | 22425.196420 | LINESTRING Z (545215.501 183267.379 0, 545183.... | 0 | 1 | CSS Area | 21.375571 |
| 26 | C | Cycle Superhighways | Lancaster Gate to Barking | Horse Guards Road | Complete | Complete | Yes | 1.336 | 2023-11-21 00:00:00+00:00 | 2017 | 1336.495364 | LINESTRING Z (529903.742 179698.188 0, 529903.... | 0 | 1 | CSS Area | 21.375571 |
| 35 | C5 | Cycleways | Waterloo to Clapham | Waterloo to Clapham | Complete | Complete | Yes | 2.076 | 2023-11-21 00:00:00+00:00 | 2026 | 2076.093601 | LINESTRING Z (530519.358 178020.846 0, 530521.... | 0 | 1 | CSS Area | 21.375571 |
| 40 | C | Cycleways | Lambeth Roundabout to P.Square | Lambeth Roundabout to P.Square | (SG3) Concept Design | Feasibility | Yes | 0.598 | 2023-11-21 00:00:00+00:00 | 2031 | 598.403380 | LINESTRING Z (530155.358 179582.367 0, 530160.... | 0 | 1 | CSS Area | 21.375571 |
| 42 | C11 | Cycleways | Essex Road to Farringdon | The City to Farringdon | Complete | Complete | Yes | 1.243 | 2023-11-21 00:00:00+00:00 | 2033 | 1243.120260 | LINESTRING Z (532591.297 181938.706 0, 532574.... | 0 | 1 | CSS Area | 21.375571 |
| 50 | C | Cycleways | C1 to Liverpool Street | C1 to Liverpool Street | Complete | Complete | Yes | 0.525 | 2023-11-21 00:00:00+00:00 | 2041 | 524.539812 | LINESTRING Z (532944.816 181895.234 0, 533036.... | 0 | 1 | CSS Area | 21.375571 |
| 73 | C | Cycleways | Lambeth Roundabout - North & South | Lambeth Roundabout - North & South | (SG4) Detailed Design | Feasibility | Yes | 0.519 | 2023-11-21 00:00:00+00:00 | 2064 | 519.098639 | LINESTRING Z (530531.437 178939.038 0, 530531.... | 0 | 1 | CSS Area | 21.375571 |
| 83 | C14 | Cycleways | Blackfriars to Rotherhithe | Blackfriars to Rotherhithe | Complete | Complete | Yes | 6.995 | 2023-11-21 00:00:00+00:00 | 2074 | 6994.926446 | LINESTRING Z (536853.163 178583.076 0, 536808.... | 0 | 1 | CSS Area | 21.375571 |
| 103 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.942 | 2023-11-21 00:00:00+00:00 | 2094 | 942.260282 | LINESTRING Z (529477.755 181246.222 25.5, 5295... | 0 | 1 | CSS Area | 21.375571 |
| 110 | C | Central London Grid | Fitzrovia to Soho | Fitzrovia to Soho | (SG4) Detailed Design | Feasibility | Yes | 0.365 | 2023-11-21 00:00:00+00:00 | 2101 | 365.025555 | LINESTRING Z (529360.759 181642.231 28.4, 5293... | 0 | 1 | CSS Area | 21.375571 |
| 117 | C | Cycleways | C4 to C14 and C10 | C4 to C14 and C10 | (SG4) Detailed Design | Feasibility | Yes | 0.377 | 2023-11-21 00:00:00+00:00 | 2108 | 377.333957 | LINESTRING Z (533001.786 179343.19 0, 532949.3... | 0 | 1 | CSS Area | 21.375571 |
| 118 | C | Cycleways | Waterloo Bridge | Waterloo Bridge | Complete | Complete | Yes | 0.491 | 2023-11-21 00:00:00+00:00 | 2109 | 490.560628 | LINESTRING Z (530690.767 180675.212 0, 530941.... | 0 | 1 | CSS Area | 21.375571 |
| 120 | C | Cycleways | Oval to C5 | Oval to C5 | Complete | Complete | Yes | 1.032 | 2023-11-21 00:00:00+00:00 | 2111 | 1032.298014 | LINESTRING Z (530701.744 178648.168 0, 530746.... | 0 | 1 | CSS Area | 21.375571 |
| 138 | C | Cycleways | Old Paradise Street | Old Paradise Street | Complete | Complete | Yes | 0.317 | 2023-11-21 00:00:00+00:00 | 2129 | 316.683268 | LINESTRING Z (530822.748 178867.173 0, 530814.... | 0 | 1 | CSS Area | 21.375571 |
| 139 | C11 | Cycleways | Essex Road to Farringdon | Essex Road to The City | Complete | Complete | Yes | 2.123 | 2023-11-21 00:00:00+00:00 | 2130 | 2123.492454 | LINESTRING Z (531860.953 183765.955 0, 532108.... | 0 | 1 | CSS Area | 21.375571 |
# Plot the intersecting single-LineString routes on top of the CCZ.
fig, ax = plt.subplots(figsize=(6, 6))
gdf_ccz.plot(ax=ax, edgecolor='black', color='#fff2f2', alpha=1, linewidth=0.6)
gdf_ccz_cycle_routes_intersect_linestring.plot(ax=ax, edgecolor='grey', color='royalblue', alpha=1, linewidth=2)
ax.set_axis_off()
We can also perform a spatial join between the London cycle routes (single LineStrings only) and the CCZ to find which routes cross the CCZ, i.e. lie partly inside and partly outside it.
# Perform a spatial join between London cycle routes and CCZ.
gdf_ccz_cycle_routes_cross = gpd.sjoin(gdf_cycle_routes_s, gdf_ccz, how='inner', predicate='crosses')
# 6 cycle routes cross the CCZ.
gdf_ccz_cycle_routes_cross
| | LABEL | PROGRAMME | ROUTE_NAME | ROUTE | MILESTONE | STATUS | PUBLIC_ | ROUTE_LENGTH_KM | PROGRAMME_UPDATED | OBJECTID_left | Shape__Length | geometry | index_right | OBJECTID_right | BOUNDARY | Shape_Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C38 | Cycleways | Finsbury Park to Highbury Fields | Islington to Finsbury | (SG2) Option Selection | Feasibility | Yes | 0.566 | 2023-11-21 00:00:00+00:00 | 1991 | 565.899062 | LINESTRING Z (531184.797 182813.267 0, 531180.... | 0 | 1 | CSS Area | 21.375571 |
| 22 | C3 | Cycleways | Lancaster Gate to Barking | Lancaster Gate to Barking | Complete | Complete | Yes | 22.425 | 2023-11-21 00:00:00+00:00 | 2013 | 22425.196420 | LINESTRING Z (545215.501 183267.379 0, 545183.... | 0 | 1 | CSS Area | 21.375571 |
| 35 | C5 | Cycleways | Waterloo to Clapham | Waterloo to Clapham | Complete | Complete | Yes | 2.076 | 2023-11-21 00:00:00+00:00 | 2026 | 2076.093601 | LINESTRING Z (530519.358 178020.846 0, 530521.... | 0 | 1 | CSS Area | 21.375571 |
| 83 | C14 | Cycleways | Blackfriars to Rotherhithe | Blackfriars to Rotherhithe | Complete | Complete | Yes | 6.995 | 2023-11-21 00:00:00+00:00 | 2074 | 6994.926446 | LINESTRING Z (536853.163 178583.076 0, 536808.... | 0 | 1 | CSS Area | 21.375571 |
| 120 | C | Cycleways | Oval to C5 | Oval to C5 | Complete | Complete | Yes | 1.032 | 2023-11-21 00:00:00+00:00 | 2111 | 1032.298014 | LINESTRING Z (530701.744 178648.168 0, 530746.... | 0 | 1 | CSS Area | 21.375571 |
| 139 | C11 | Cycleways | Essex Road to Farringdon | Essex Road to The City | Complete | Complete | Yes | 2.123 | 2023-11-21 00:00:00+00:00 | 2130 | 2123.492454 | LINESTRING Z (531860.953 183765.955 0, 532108.... | 0 | 1 | CSS Area | 21.375571 |
# Plot the routes that cross the CCZ on top of the CCZ.
fig, ax = plt.subplots(figsize=(6, 6))
gdf_ccz.plot(ax=ax, edgecolor='black', color='#fff2f2', alpha=1, linewidth=0.6)
gdf_ccz_cycle_routes_cross.plot(ax=ax, edgecolor='grey', color='royalblue', alpha=1, linewidth=2)
ax.set_axis_off()
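The gap between the 17 routes returned by `intersects` and the 6 returned by `crosses` comes down to how each predicate treats a line that lies entirely inside the polygon. The sketch below compares the predicates on three made-up lines against a hypothetical square `zone`; none of the coordinates come from the tutorial data.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, Polygon

# Hypothetical square zone standing in for the CCZ polygon.
zone = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])

lines = gpd.GeoSeries(
    [
        LineString([(2, 2), (8, 8)]),      # entirely inside the zone
        LineString([(-5, 5), (5, 5)]),     # starts outside, ends inside
        LineString([(20, 20), (30, 30)]),  # entirely outside the zone
    ],
    index=['inside', 'through', 'outside'],
)

# Evaluate each predicate for every line against the same polygon.
summary = pd.DataFrame({
    'intersects': lines.intersects(zone),
    'crosses': lines.crosses(zone),
    'within': lines.within(zone),
})
print(summary)
```

A line fully inside the zone intersects it and is within it, but does not cross it; only the line that passes over the boundary satisfies `crosses`.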
1.5 An example of spatial and temporal data processing and integration#
Task description:
Build a crime prediction model that uses house price and population data as predictors (X). The case study area is Kingston upon Hull; the spatial unit of analysis is the Lower layer Super Output Area (LSOA) and the temporal unit of analysis is the month.
Data sources:
All data come from the ONS and UK open police data.
UK LSOA boundaries: data/Lower_layer_Super_Output_Areas_(December_2021)_Boundaries_EW_BFC_(V10).geojson
Humberside Police: data/police_humberside_2024
UK LSOA house prices: data/Mean house prices by lower layer super output area- HPSSA dataset 47.csv
UK LSOA population: data/Lower layer Super Output Area population estimates 2019-2022.csv
Note: Make sure that the geographic unit identifiers (e.g. LSOA codes or names) in the boundary file and in the CSV (or other) files refer to the same boundary version. Geographic boundaries and their codes change over time (for example, LSOAs have 2011 and 2021 versions), so always verify that they match before processing the data.
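One simple way to check that two sources use the same boundary version is to compare their sets of codes before merging. The sketch below assumes two hypothetical code columns; `E01012999` is an invented code used only to show what a mismatch looks like.

```python
import pandas as pd

# Codes as they might appear in the boundary file (hypothetical sample).
boundary_codes = pd.Series(['E01012756', 'E01012757', 'E01035468'], name='LSOA21CD')
# Codes as they might appear in an attribute table (hypothetical sample;
# 'E01012999' is invented to simulate a code from another version).
table_codes = pd.Series(['E01012756', 'E01012757', 'E01012999'], name='LSOA code')

# Codes present on one side only point to a version mismatch.
only_in_boundaries = sorted(set(boundary_codes) - set(table_codes))
only_in_table = sorted(set(table_codes) - set(boundary_codes))

# Share of boundary codes that also occur in the attribute table.
match_rate = boundary_codes.isin(table_codes).mean()

print(only_in_boundaries, only_in_table, round(match_rate, 2))
```

If either list is non-empty, or the match rate is well below 1, the two files probably refer to different versions and need a lookup table or a spatial join instead of a direct merge on the code.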
1 Read the UK LSOA GeoJSON
gdf_lsoa_uk = gpd.read_file('data/Lower_layer_Super_Output_Areas_(December_2021)_Boundaries_EW_BFC_(V10).geojson')
gdf_lsoa_uk.head()
| | FID | LSOA21CD | LSOA21NM | LSOA21NMW | BNG_E | BNG_N | LAT | LONG | Shape__Area | Shape__Length | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | E01000001 | City of London 001A | | 532123 | 181632 | 51.51817 | -0.097150 | 129865.314476 | 2635.767993 | c625aea8-6d73-4b2a-be76-4d5c44cad9f8 | POLYGON ((-0.09665 51.52028, -0.09663 51.52025... |
| 1 | 2 | E01000002 | City of London 001B | | 532480 | 181715 | 51.51883 | -0.091970 | 228419.782242 | 2707.816821 | 52c878e9-ac68-4886-b4a8-fea9cd241a70 | POLYGON ((-0.08967 51.52069, -0.08971 51.52058... |
| 2 | 3 | E01000003 | City of London 001C | | 532239 | 182033 | 51.52174 | -0.095330 | 59054.204697 | 1224.573160 | b9d8faca-d489-478d-8ce6-acaf76186d7d | POLYGON ((-0.0965 51.52295, -0.09644 51.52282,... |
| 3 | 4 | E01000005 | City of London 001E | | 533581 | 181283 | 51.51469 | -0.076280 | 189577.709503 | 2275.805344 | 15e1417d-537c-4845-9820-fc7596bd59b0 | POLYGON ((-0.07568 51.51575, -0.07539 51.51555... |
| 4 | 5 | E01000006 | Barking and Dagenham 016A | | 544994 | 184274 | 51.53875 | 0.089317 | 146536.995750 | 1966.092607 | 8a6c4ee0-c0ff-4736-9cfa-fb12a6d50da0 | POLYGON ((0.09125 51.53905, 0.0915 51.5389, 0.... |
2 Select the LSOAs in Kingston upon Hull
gdf_lsoa_hull = gdf_lsoa_uk[gdf_lsoa_uk['LSOA21NM'].str.contains('Kingston upon Hull')]
# Reproject to EPSG:27700 (British National Grid)
gdf_lsoa_hull = gdf_lsoa_hull.to_crs(epsg=27700)
print(len(gdf_lsoa_hull))
gdf_lsoa_hull.plot()
168
gdf_lsoa_hull
| | FID | LSOA21CD | LSOA21NM | LSOA21NMW | BNG_E | BNG_N | LAT | LONG | Shape__Area | Shape__Length | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12115 | 12116 | E01012756 | Kingston upon Hull 025A | | 507367 | 430316 | 53.75817 | -0.37294 | 198939.277603 | 3497.174098 | b1f95abf-436c-4f5a-b6e1-93be9e0eb24d | POLYGON ((507432.57 430449.315, 507438.172 430... |
| 12116 | 12117 | E01012757 | Kingston upon Hull 025B | | 508017 | 429519 | 53.75088 | -0.36336 | 318088.364120 | 3716.172876 | 5f00b0d5-45c1-4529-9e08-881c4c58834d | POLYGON ((508038.876 430023.773, 508047.19 429... |
| 12117 | 12118 | E01012758 | Kingston upon Hull 018A | | 507706 | 430320 | 53.75814 | -0.36780 | 311920.300018 | 3772.447527 | 94cd85a6-b982-4038-9625-333de4b818df | POLYGON ((507790.751 430455.62, 507795.274 430... |
| 12118 | 12119 | E01012759 | Kingston upon Hull 025C | | 507184 | 429662 | 53.75233 | -0.37594 | 398184.936035 | 3984.903843 | 1516f69d-ec4e-4e2b-8f3d-81d5f680f631 | POLYGON ((507087.666 429995.28, 507087.776 429... |
| 12119 | 12120 | E01012760 | Kingston upon Hull 025D | | 507714 | 429795 | 53.75342 | -0.36786 | 126002.444229 | 2081.849577 | 2e8144b1-63d2-4100-b2a5-a1fe09759513 | POLYGON ((507913.4 429953.245, 507917.247 4299... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 33462 | 33463 | E01035468 | Kingston upon Hull 031I | | 507220 | 427868 | 53.73621 | -0.37602 | 194811.873474 | 2342.011663 | 41c0a371-996d-4c62-8888-fb4830b71528 | POLYGON ((507208.747 428145.953, 507211.212 42... |
| 33463 | 33464 | E01035469 | Kingston upon Hull 034C | | 509268 | 435508 | 53.80442 | -0.34228 | 602229.054047 | 4797.372241 | ef8cf4ea-28e0-44ba-bdad-f74de2c628c2 | POLYGON ((509524.189 435505.919, 509535.379 43... |
| 33464 | 33465 | E01035470 | Kingston upon Hull 035A | | 508599 | 435590 | 53.80530 | -0.35241 | 512661.485153 | 3739.146033 | f97f0a63-17c4-461a-84e0-f69e41992515 | POLYGON ((508961.99 435419.217, 508962.578 435... |
| 33465 | 33466 | E01035471 | Kingston upon Hull 035B | | 508633 | 435072 | 53.80064 | -0.35207 | 204771.100311 | 2848.973680 | 1dafa6ed-12b8-45cb-a67f-c559fdccfec6 | POLYGON ((508775.302 435262.868, 508776.423 43... |
| 33466 | 33467 | E01035472 | Kingston upon Hull 035C | | 508142 | 434887 | 53.79908 | -0.35959 | 503061.344757 | 3672.250000 | 25c17302-b93b-4752-a042-f2ceadb41dd8 | POLYGON ((507929.806 435254.935, 507986.426 43... |
168 rows × 12 columns
# Reset the row index to 0..n-1 after filtering
gdf_lsoa_hull.index = range(len(gdf_lsoa_hull))
gdf_lsoa_hull
| | FID | LSOA21CD | LSOA21NM | LSOA21NMW | BNG_E | BNG_N | LAT | LONG | Shape__Area | Shape__Length | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 12116 | E01012756 | Kingston upon Hull 025A | | 507367 | 430316 | 53.75817 | -0.37294 | 198939.277603 | 3497.174098 | b1f95abf-436c-4f5a-b6e1-93be9e0eb24d | POLYGON ((507432.57 430449.315, 507438.172 430... |
| 1 | 12117 | E01012757 | Kingston upon Hull 025B | | 508017 | 429519 | 53.75088 | -0.36336 | 318088.364120 | 3716.172876 | 5f00b0d5-45c1-4529-9e08-881c4c58834d | POLYGON ((508038.876 430023.773, 508047.19 429... |
| 2 | 12118 | E01012758 | Kingston upon Hull 018A | | 507706 | 430320 | 53.75814 | -0.36780 | 311920.300018 | 3772.447527 | 94cd85a6-b982-4038-9625-333de4b818df | POLYGON ((507790.751 430455.62, 507795.274 430... |
| 3 | 12119 | E01012759 | Kingston upon Hull 025C | | 507184 | 429662 | 53.75233 | -0.37594 | 398184.936035 | 3984.903843 | 1516f69d-ec4e-4e2b-8f3d-81d5f680f631 | POLYGON ((507087.666 429995.28, 507087.776 429... |
| 4 | 12120 | E01012760 | Kingston upon Hull 025D | | 507714 | 429795 | 53.75342 | -0.36786 | 126002.444229 | 2081.849577 | 2e8144b1-63d2-4100-b2a5-a1fe09759513 | POLYGON ((507913.4 429953.245, 507917.247 4299... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 163 | 33463 | E01035468 | Kingston upon Hull 031I | | 507220 | 427868 | 53.73621 | -0.37602 | 194811.873474 | 2342.011663 | 41c0a371-996d-4c62-8888-fb4830b71528 | POLYGON ((507208.747 428145.953, 507211.212 42... |
| 164 | 33464 | E01035469 | Kingston upon Hull 034C | | 509268 | 435508 | 53.80442 | -0.34228 | 602229.054047 | 4797.372241 | ef8cf4ea-28e0-44ba-bdad-f74de2c628c2 | POLYGON ((509524.189 435505.919, 509535.379 43... |
| 165 | 33465 | E01035470 | Kingston upon Hull 035A | | 508599 | 435590 | 53.80530 | -0.35241 | 512661.485153 | 3739.146033 | f97f0a63-17c4-461a-84e0-f69e41992515 | POLYGON ((508961.99 435419.217, 508962.578 435... |
| 166 | 33466 | E01035471 | Kingston upon Hull 035B | | 508633 | 435072 | 53.80064 | -0.35207 | 204771.100311 | 2848.973680 | 1dafa6ed-12b8-45cb-a67f-c559fdccfec6 | POLYGON ((508775.302 435262.868, 508776.423 43... |
| 167 | 33467 | E01035472 | Kingston upon Hull 035C | | 508142 | 434887 | 53.79908 | -0.35959 | 503061.344757 | 3672.250000 | 25c17302-b93b-4752-a042-f2ceadb41dd8 | POLYGON ((507929.806 435254.935, 507986.426 43... |
168 rows × 12 columns
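Assigning `range(len(...))` to the index works; a common alternative is `reset_index(drop=True)`, which should give the same result on a filtered frame. The small frame below is hypothetical and only reuses three index labels and codes from the tables above.

```python
import pandas as pd

# A filtered frame keeps the row labels of the original table.
df = pd.DataFrame(
    {'LSOA21CD': ['E01012756', 'E01012757', 'E01035468']},
    index=[12115, 12116, 33462],
)

# Approach used in the tutorial: overwrite the index with 0..n-1.
by_range = df.copy()
by_range.index = range(len(by_range))

# Alternative: reset_index(drop=True) discards the old labels
# and returns a new frame with a fresh 0..n-1 index.
by_reset = df.reset_index(drop=True)

print(by_range.equals(by_reset))
```

Because `reset_index` returns a new object rather than modifying in place, it can be chained directly after the filtering step.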
3 Read Police data
import os
import glob
csv_files = glob.glob('data/police_humberside_2024/**/*.csv', recursive=True)
print(csv_files)
['data/police_humberside_2024/2024-09/2024-09-humberside-street.csv', 'data/police_humberside_2024/2024-07/2024-07-humberside-street.csv', 'data/police_humberside_2024/2024-06/2024-06-humberside-street.csv', 'data/police_humberside_2024/2024-01/2024-01-humberside-street.csv', 'data/police_humberside_2024/2024-08/2024-08-humberside-street.csv', 'data/police_humberside_2024/2024-12/2024-12-humberside-street.csv', 'data/police_humberside_2024/2024-04/2024-04-humberside-street.csv', 'data/police_humberside_2024/2024-03/2024-03-humberside-street.csv', 'data/police_humberside_2024/2024-02/2024-02-humberside-street.csv', 'data/police_humberside_2024/2024-05/2024-05-humberside-street.csv', 'data/police_humberside_2024/2024-11/2024-11-humberside-street.csv', 'data/police_humberside_2024/2024-10/2024-10-humberside-street.csv']
df_crime_hu_2024 = pd.concat([pd.read_csv(f) for f in csv_files])
df_crime_hu_2024
| | Crime ID | Month | Reported by | Falls within | Longitude | Latitude | Location | LSOA code | LSOA name | Crime type | Last outcome category | Context |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 26945ea720972254fe4c2f0e2ccc59c49d7354ef2fbdf1... | 2024-09 | Humberside Police | Humberside Police | -0.949067 | 53.604417 | On or near St Georges Road | E01007641 | Doncaster 003F | Violence and sexual offences | Status update unavailable | NaN |
| 1 | 9bf8204c7cad6f6b5e4290fe564470d06deb29edfcf954... | 2024-09 | Humberside Police | Humberside Police | -1.033626 | 53.642072 | On or near Kirk Lane | E01007625 | Doncaster 004A | Vehicle crime | Investigation complete; no suspect identified | NaN |
| 2 | 0f2144b1195904783aa3159185c77a3d16550a13625c7f... | 2024-09 | Humberside Police | Humberside Police | -1.034449 | 53.600066 | On or near West End | E01007626 | Doncaster 004B | Burglary | Awaiting court outcome | NaN |
| 3 | 41192c91566add280cbf766e49dc4ae3e1d31f89274005... | 2024-09 | Humberside Police | Humberside Police | -1.105560 | 53.511434 | On or near Lake View | E01034240 | Doncaster 027E | Violence and sexual offences | Action to be taken by another organisation | NaN |
| 4 | e505ec90b4dc6aa04bfe3d3fe3fd661f38efe986049f26... | 2024-09 | Humberside Police | Humberside Police | -1.222401 | 53.479055 | On or near Sheldon Avenue | E01007537 | Doncaster 035A | Violence and sexual offences | Status update unavailable | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 8118 | 1ac181333df8d1877ea00ffca5e89aabf4e0bb3ed2d416... | 2024-10 | Humberside Police | Humberside Police | -1.141953 | 53.699111 | On or near High Eggborough Lane | E01027890 | Selby 010B | Drugs | Status update unavailable | NaN |
| 8119 | c9ff9608cf5bdb6d2268b2b774ffaebe398ca0503e4933... | 2024-10 | Humberside Police | Humberside Police | -1.103616 | 53.684144 | On or near Long Lane | E01027924 | Selby 010D | Violence and sexual offences | Status update unavailable | NaN |
| 8120 | eeb33d2af2d411d82ea5247bd8c51892c6aaaeb92af6b2... | 2024-10 | Humberside Police | Humberside Police | -1.103616 | 53.684144 | On or near Long Lane | E01027924 | Selby 010D | Violence and sexual offences | Status update unavailable | NaN |
| 8121 | 224054ca4eb68817bad2bdb0ac0405825468da1409fb33... | 2024-10 | Humberside Police | Humberside Police | -0.752364 | 53.390423 | On or near Riby Close | E01026375 | West Lindsey 006B | Violence and sexual offences | Unable to prosecute suspect | NaN |
| 8122 | 1183a22df9f1921bfdf8564b974c4d3b198dd93426b306... | 2024-10 | Humberside Police | Humberside Police | -1.053831 | 53.977489 | On or near New Lane | E01013409 | York 005C | Violence and sexual offences | Status update unavailable | NaN |
98603 rows × 12 columns
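In the output above the last row label is 8122 although the table has 98603 rows, because `pd.concat` keeps each file's own row labels. The sketch below, which writes two tiny hypothetical CSV files to a temporary folder, shows one way to get a unique index (`ignore_index=True`) and a reproducible file order (`sorted`); the file contents are invented.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Create one small CSV per month, mimicking the monthly folder layout.
    for month, n_rows in [('2024-01', 2), ('2024-02', 3)]:
        folder = os.path.join(tmp, month)
        os.makedirs(folder)
        pd.DataFrame(
            {'Month': [month] * n_rows, 'Crime type': ['Burglary'] * n_rows}
        ).to_csv(os.path.join(folder, f'{month}-humberside-street.csv'), index=False)

    # glob returns files in arbitrary order; sorting makes the order stable.
    csv_files = sorted(glob.glob(os.path.join(tmp, '**', '*.csv'), recursive=True))
    frames = [pd.read_csv(f) for f in csv_files]

# Default concat repeats the per-file labels (0, 1, 0, 1, 2).
df_default = pd.concat(frames)
# ignore_index=True builds one continuous index (0..4).
df_clean = pd.concat(frames, ignore_index=True)

print(df_default.index.tolist(), df_clean.index.tolist())
```

Duplicate labels are harmless for the spatial join that follows, but a unique index avoids surprises when selecting rows with `.loc`.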
4 Use a spatial join to link each crime point to its 2021 LSOA (we do not use the LSOA code column in the crime data because it follows the 2011 LSOA version)
# Create a GeoDataFrame from df_crime_hu_2024 using the Longitude (x) and Latitude (y) columns
gdf_crime_hu = gpd.GeoDataFrame(df_crime_hu_2024, geometry=gpd.points_from_xy(df_crime_hu_2024['Longitude'],
df_crime_hu_2024['Latitude']), crs='EPSG:4326')
# Reproject to EPSG:27700 (British National Grid)
gdf_crime_hu = gdf_crime_hu.to_crs(epsg=27700)
gdf_crime_hu
| | Crime ID | Month | Reported by | Falls within | Longitude | Latitude | Location | LSOA code | LSOA name | Crime type | Last outcome category | Context | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 26945ea720972254fe4c2f0e2ccc59c49d7354ef2fbdf1... | 2024-09 | Humberside Police | Humberside Police | -0.949067 | 53.604417 | On or near St Georges Road | E01007641 | Doncaster 003F | Violence and sexual offences | Status update unavailable | NaN | POINT (469638.021 412497.03) |
| 1 | 9bf8204c7cad6f6b5e4290fe564470d06deb29edfcf954... | 2024-09 | Humberside Police | Humberside Police | -1.033626 | 53.642072 | On or near Kirk Lane | E01007625 | Doncaster 004A | Vehicle crime | Investigation complete; no suspect identified | NaN | POINT (463986.014 416606.954) |
| 2 | 0f2144b1195904783aa3159185c77a3d16550a13625c7f... | 2024-09 | Humberside Police | Humberside Police | -1.034449 | 53.600066 | On or near West End | E01007626 | Doncaster 004B | Burglary | Awaiting court outcome | NaN | POINT (463994.999 411932.963) |
| 3 | 41192c91566add280cbf766e49dc4ae3e1d31f89274005... | 2024-09 | Humberside Police | Humberside Police | -1.105560 | 53.511434 | On or near Lake View | E01034240 | Doncaster 027E | Violence and sexual offences | Action to be taken by another organisation | NaN | POINT (459412.986 402011.051) |
| 4 | e505ec90b4dc6aa04bfe3d3fe3fd661f38efe986049f26... | 2024-09 | Humberside Police | Humberside Police | -1.222401 | 53.479055 | On or near Sheldon Avenue | E01007537 | Doncaster 035A | Violence and sexual offences | Status update unavailable | NaN | POINT (451704.012 398317.971) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 8118 | 1ac181333df8d1877ea00ffca5e89aabf4e0bb3ed2d416... | 2024-10 | Humberside Police | Humberside Police | -1.141953 | 53.699111 | On or near High Eggborough Lane | E01027890 | Selby 010B | Drugs | Status update unavailable | NaN | POINT (456747.97 422860.963) |
| 8119 | c9ff9608cf5bdb6d2268b2b774ffaebe398ca0503e4933... | 2024-10 | Humberside Police | Humberside Police | -1.103616 | 53.684144 | On or near Long Lane | E01027924 | Selby 010D | Violence and sexual offences | Status update unavailable | NaN | POINT (459300.004 421227.048) |
| 8120 | eeb33d2af2d411d82ea5247bd8c51892c6aaaeb92af6b2... | 2024-10 | Humberside Police | Humberside Police | -1.103616 | 53.684144 | On or near Long Lane | E01027924 | Selby 010D | Violence and sexual offences | Status update unavailable | NaN | POINT (459300.004 421227.048) |
| 8121 | 224054ca4eb68817bad2bdb0ac0405825468da1409fb33... | 2024-10 | Humberside Police | Humberside Police | -0.752364 | 53.390423 | On or near Riby Close | E01026375 | West Lindsey 006B | Violence and sexual offences | Unable to prosecute suspect | NaN | POINT (483070.027 388901.029) |
| 8122 | 1183a22df9f1921bfdf8564b974c4d3b198dd93426b306... | 2024-10 | Humberside Police | Humberside Police | -1.053831 | 53.977489 | On or near New Lane | E01013409 | York 005C | Violence and sexual offences | Status update unavailable | NaN | POINT (462153.013 453906.043) |
98603 rows × 13 columns
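The two steps above (build points from longitude/latitude, then reproject) can be tried on a tiny frame. The sketch below uses two coordinate pairs taken from Hull records shown in this tutorial; the frame name `df` is an assumption, and the exact projected values depend on the installed PROJ version, so only their rough range is checked.

```python
import geopandas as gpd
import pandas as pd

# Two longitude/latitude pairs from crime records located in Hull.
df = pd.DataFrame({
    'Longitude': [-0.325162, -0.355095],
    'Latitude': [53.797160, 53.793485],
})

# points_from_xy expects x (longitude) first, then y (latitude).
# EPSG:4326 declares that the numbers are WGS84 degrees.
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df['Longitude'], df['Latitude']),
    crs='EPSG:4326',
)

# to_crs converts the coordinates to British National Grid (metres),
# the same CRS as the reprojected LSOA polygons.
gdf_bng = gdf.to_crs(epsg=27700)

print(gdf_bng.crs.to_epsg())
print(gdf_bng.geometry)
```

Both layers must share a CRS before a spatial join; swapping longitude and latitude is a common mistake that typically shows up as points far from the study area.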
gdf_crime_hu.plot()
# Spatially join gdf_crime_hu with gdf_lsoa_hull to attach the 2021 LSOA to each crime incident (the 'LSOA code' column from df_crime_hu_2024 is not used)
gdf_crime_hull_lsoa = gpd.sjoin(gdf_crime_hu, gdf_lsoa_hull, how='inner', predicate='within')
gdf_crime_hull_lsoa
| | Crime ID | Month | Reported by | Falls within | Longitude | Latitude | Location | LSOA code | LSOA name | Crime type | ... | LSOA21CD | LSOA21NM | LSOA21NMW | BNG_E | BNG_N | LAT | LONG | Shape__Area | Shape__Length | GlobalID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1733 | NaN | 2024-09 | Humberside Police | Humberside Police | -0.325162 | 53.797160 | On or near Abingdon Garth | E01012782 | Kingston upon Hull 002A | Anti-social behaviour | ... | E01012782 | Kingston upon Hull 002A | | 510564 | 434676 | 53.79667 | -0.32291 | 295014.267838 | 2717.829437 | 4bac5250-9916-4da1-87ac-00c02f2d47b4 |
| 1734 | e725eb4dd48c94eb57a79eb058ec04b3e08cf3568b7a13... | 2024-09 | Humberside Police | Humberside Police | -0.321930 | 53.797475 | On or near Cosford Garth | E01012782 | Kingston upon Hull 002A | Public order | ... | E01012782 | Kingston upon Hull 002A | | 510564 | 434676 | 53.79667 | -0.32291 | 295014.267838 | 2717.829437 | 4bac5250-9916-4da1-87ac-00c02f2d47b4 |
| 1735 | 677eb12804c1fcbb51009676b156185860784be90cac2f... | 2024-09 | Humberside Police | Humberside Police | -0.325162 | 53.797160 | On or near Abingdon Garth | E01012782 | Kingston upon Hull 002A | Vehicle crime | ... | E01012782 | Kingston upon Hull 002A | | 510564 | 434676 | 53.79667 | -0.32291 | 295014.267838 | 2717.829437 | 4bac5250-9916-4da1-87ac-00c02f2d47b4 |
| 1736 | 59a88431621edfdd7a00fc76483a5681048e544aeb8039... | 2024-09 | Humberside Police | Humberside Police | -0.325162 | 53.797160 | On or near Abingdon Garth | E01012782 | Kingston upon Hull 002A | Violence and sexual offences | ... | E01012782 | Kingston upon Hull 002A | | 510564 | 434676 | 53.79667 | -0.32291 | 295014.267838 | 2717.829437 | 4bac5250-9916-4da1-87ac-00c02f2d47b4 |
| 1737 | f5495044a3393692da9d87a230fd1e2fc71fc0f957d0d3... | 2024-09 | Humberside Police | Humberside Police | -0.325162 | 53.797160 | On or near Abingdon Garth | E01012782 | Kingston upon Hull 002A | Violence and sexual offences | ... | E01012782 | Kingston upon Hull 002A | | 510564 | 434676 | 53.79667 | -0.32291 | 295014.267838 | 2717.829437 | 4bac5250-9916-4da1-87ac-00c02f2d47b4 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4865 | 3ef9fd9a74ab40e880e71733eaa2de6432b1139e97d465... | 2024-10 | Humberside Police | Humberside Police | -0.355095 | 53.793485 | On or near Raich Carter Way | E01033107 | Kingston upon Hull 035E | Shoplifting | ... | E01033107 | Kingston upon Hull 035E | | 508618 | 434547 | 53.79592 | -0.35249 | 392191.723434 | 3331.231133 | f698240e-05d9-4a6a-8c12-ba6f537ddef1 |
| 4866 | 6dad4e91318a8e31e8f2a160afe6e86c67167fc75b20ce... | 2024-10 | Humberside Police | Humberside Police | -0.350433 | 53.795452 | On or near Runnymede Way | E01033107 | Kingston upon Hull 035E | Shoplifting | ... | E01033107 | Kingston upon Hull 035E | | 508618 | 434547 | 53.79592 | -0.35249 | 392191.723434 | 3331.231133 | f698240e-05d9-4a6a-8c12-ba6f537ddef1 |
| 4867 | a099208f43ab24f8795aef51aa0c7f1ea1c38103c01373... | 2024-10 | Humberside Police | Humberside Police | -0.350433 | 53.795452 | On or near Runnymede Way | E01033107 | Kingston upon Hull 035E | Shoplifting | ... | E01033107 | Kingston upon Hull 035E | | 508618 | 434547 | 53.79592 | -0.35249 | 392191.723434 | 3331.231133 | f698240e-05d9-4a6a-8c12-ba6f537ddef1 |
| 4868 | b19c91361ef09463e19f83ec253487ab7dcb78bcdd9e3c... | 2024-10 | Humberside Police | Humberside Police | -0.348154 | 53.798576 | On or near Halecroft Park | E01033107 | Kingston upon Hull 035E | Vehicle crime | ... | E01033107 | Kingston upon Hull 035E | | 508618 | 434547 | 53.79592 | -0.35249 | 392191.723434 | 3331.231133 | f698240e-05d9-4a6a-8c12-ba6f537ddef1 |
| 4869 | 4c464b1fef6838cc3c986b0ad24efaa4a754a3bd0390c7... | 2024-10 | Humberside Police | Humberside Police | -0.353396 | 53.796940 | On or near Knightley Way | E01033107 | Kingston upon Hull 035E | Violence and sexual offences | ... | E01033107 | Kingston upon Hull 035E | | 508618 | 434547 | 53.79592 | -0.35249 | 392191.723434 | 3331.231133 | f698240e-05d9-4a6a-8c12-ba6f537ddef1 |
37931 rows × 25 columns
gdf_crime_hull_lsoa.plot()
<Axes: >
5 Aggregation in space and time: count the crimes in each LSOA and month to get a crime DataFrame at LSOA-month level
df_crime_hull_agg = gdf_crime_hull_lsoa.groupby(['LSOA21CD', 'Month']).agg({'Crime ID': 'count'}).reset_index().rename(columns={'Crime ID': 'Numbers'})
df_crime_hull_agg
| | LSOA21CD | Month | Numbers |
|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 |
| 1 | E01012756 | 2024-02 | 11 |
| 2 | E01012756 | 2024-03 | 15 |
| 3 | E01012756 | 2024-04 | 10 |
| 4 | E01012756 | 2024-05 | 12 |
| ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 |
| 1998 | E01035472 | 2024-09 | 4 |
| 1999 | E01035472 | 2024-10 | 1 |
| 2000 | E01035472 | 2024-11 | 3 |
| 2001 | E01035472 | 2024-12 | 10 |
2002 rows × 3 columns
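Note that `groupby` only returns the LSOA-month pairs that actually occur in the data: in the output above, E01035472 has no row for 2024-08. If a complete panel is needed (every LSOA in every month), one option is to reindex against all combinations and fill the gaps with zero. The following is a minimal sketch on toy data; the frame `df_agg` and its values are illustrative stand-ins, not the tutorial's data, and treating a missing pair as zero crimes is an assumption.

```python
import pandas as pd

# Toy stand-in for df_crime_hull_agg: only observed LSOA-month pairs are present.
df_agg = pd.DataFrame({
    'LSOA21CD': ['A', 'A', 'B'],
    'Month': ['2024-01', '2024-02', '2024-02'],
    'Numbers': [3, 11, 4],
})

# Build every LSOA-month combination from the observed codes and months.
full_index = pd.MultiIndex.from_product(
    [df_agg['LSOA21CD'].unique(), df_agg['Month'].unique()],
    names=['LSOA21CD', 'Month'],
)

# Reindex so that pairs missing from the data appear with a count of 0.
df_full = (
    df_agg.set_index(['LSOA21CD', 'Month'])
    .reindex(full_index, fill_value=0)
    .reset_index()
)
print(df_full)
```

In this toy example the pair (B, 2024-01) is added with `Numbers` equal to 0.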
6 Merge the LSOA geometries with the LSOA-month crime counts for a selected month, so that the result can be mapped with GeoPandas
# select April 2024 (Month == '2024-04')
gdf_crime_hull_agg_Apr = pd.merge(gdf_lsoa_hull, df_crime_hull_agg[df_crime_hull_agg.Month == '2024-04'], on='LSOA21CD', how='left')
gdf_crime_hull_agg_Apr
| | FID | LSOA21CD | LSOA21NM | LSOA21NMW | BNG_E | BNG_N | LAT | LONG | Shape__Area | Shape__Length | GlobalID | geometry | Month | Numbers |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 12116 | E01012756 | Kingston upon Hull 025A | | 507367 | 430316 | 53.75817 | -0.37294 | 198939.277603 | 3497.174098 | b1f95abf-436c-4f5a-b6e1-93be9e0eb24d | POLYGON ((507432.57 430449.315, 507438.172 430... | 2024-04 | 10.0 |
| 1 | 12117 | E01012757 | Kingston upon Hull 025B | | 508017 | 429519 | 53.75088 | -0.36336 | 318088.364120 | 3716.172876 | 5f00b0d5-45c1-4529-9e08-881c4c58834d | POLYGON ((508038.876 430023.773, 508047.19 429... | 2024-04 | 24.0 |
| 2 | 12118 | E01012758 | Kingston upon Hull 018A | | 507706 | 430320 | 53.75814 | -0.36780 | 311920.300018 | 3772.447527 | 94cd85a6-b982-4038-9625-333de4b818df | POLYGON ((507790.751 430455.62, 507795.274 430... | 2024-04 | 6.0 |
| 3 | 12119 | E01012759 | Kingston upon Hull 025C | | 507184 | 429662 | 53.75233 | -0.37594 | 398184.936035 | 3984.903843 | 1516f69d-ec4e-4e2b-8f3d-81d5f680f631 | POLYGON ((507087.666 429995.28, 507087.776 429... | 2024-04 | 18.0 |
| 4 | 12120 | E01012760 | Kingston upon Hull 025D | | 507714 | 429795 | 53.75342 | -0.36786 | 126002.444229 | 2081.849577 | 2e8144b1-63d2-4100-b2a5-a1fe09759513 | POLYGON ((507913.4 429953.245, 507917.247 4299... | 2024-04 | 9.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 163 | 33463 | E01035468 | Kingston upon Hull 031I | | 507220 | 427868 | 53.73621 | -0.37602 | 194811.873474 | 2342.011663 | 41c0a371-996d-4c62-8888-fb4830b71528 | POLYGON ((507208.747 428145.953, 507211.212 42... | 2024-04 | 28.0 |
| 164 | 33464 | E01035469 | Kingston upon Hull 034C | | 509268 | 435508 | 53.80442 | -0.34228 | 602229.054047 | 4797.372241 | ef8cf4ea-28e0-44ba-bdad-f74de2c628c2 | POLYGON ((509524.189 435505.919, 509535.379 43... | 2024-04 | 1.0 |
| 165 | 33465 | E01035470 | Kingston upon Hull 035A | | 508599 | 435590 | 53.80530 | -0.35241 | 512661.485153 | 3739.146033 | f97f0a63-17c4-461a-84e0-f69e41992515 | POLYGON ((508961.99 435419.217, 508962.578 435... | 2024-04 | 6.0 |
| 166 | 33466 | E01035471 | Kingston upon Hull 035B | | 508633 | 435072 | 53.80064 | -0.35207 | 204771.100311 | 2848.973680 | 1dafa6ed-12b8-45cb-a67f-c559fdccfec6 | POLYGON ((508775.302 435262.868, 508776.423 43... | NaN | NaN |
| 167 | 33467 | E01035472 | Kingston upon Hull 035C | | 508142 | 434887 | 53.79908 | -0.35959 | 503061.344757 | 3672.250000 | 25c17302-b93b-4752-a042-f2ceadb41dd8 | POLYGON ((507929.806 435254.935, 507986.426 43... | 2024-04 | 9.0 |
168 rows × 14 columns
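Because this is a left merge, LSOAs with no April row in the crime table keep their geometry but get NaN in `Month` and `Numbers` (E01035471 in the output above). If a missing count should be read as zero recorded crimes, which is an assumption about the data rather than something the merge guarantees, the gaps can be filled before plotting. A minimal sketch on toy data with plain pandas, where `df_lsoa` and `df_apr` are hypothetical stand-ins for `gdf_lsoa_hull` and the April subset:

```python
import pandas as pd

# Toy stand-ins: one row per LSOA, and April counts for only some of them.
df_lsoa = pd.DataFrame({'LSOA21CD': ['A', 'B', 'C']})
df_apr = pd.DataFrame({
    'LSOA21CD': ['A', 'C'],
    'Month': ['2024-04', '2024-04'],
    'Numbers': [10, 9],
})

# The left merge keeps all LSOAs; 'B' has no match and gets NaN.
merged = pd.merge(df_lsoa, df_apr, on='LSOA21CD', how='left')

# Assumption: no matching row means zero recorded crimes in that month.
merged['Numbers'] = merged['Numbers'].fillna(0).astype(int)
merged['Month'] = merged['Month'].fillna('2024-04')
print(merged)
```

Without this step, areas with NaN carry no value for the `Numbers` column when the map is drawn.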
fig, ax = plt.subplots(figsize=(8, 8))
# plot the crime data
gdf_crime_hull_agg_Apr.plot(ax=ax, column='Numbers', cmap='Reds', edgecolor='black', linewidth=0.05, legend=True)
ax.set_title('Crime Numbers in Kingston upon Hull LSOAs (Apr 2024)')
ax.axis('off')
plt.show()
7 Read house price data
# read the csv file
df_house_price = pd.read_csv('data/Mean house prices by lower layer super output area- HPSSA dataset 47.csv')
df_house_price_hull = df_house_price[df_house_price['Local authority name'].str.contains('Hull')]
df_house_price_hull
| | Local authority code | Local authority name | LSOA code | LSOA name | Year ending Dec 1995 | Year ending Mar 1996 | Year ending Jun 1996 | Year ending Sep 1996 | Year ending Dec 1996 | Year ending Mar 1997 | ... | Year ending Sep 2020 | Year ending Dec 2020 | Year ending Mar 2021 | Year ending Jun 2021 | Year ending Sep 2021 | Year ending Dec 2021 | Year ending Mar 2022 | Year ending Jun 2022 | Year ending Sep 2022 | Year ending Dec 2022 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 808 | E06000010 | Kingston upon Hull, City of | E01012756 | Kingston upon Hull 025A | 40,920 | 41,652 | 40,819 | 42,748 | 39,059 | 39,567 | ... | 135,647 | 151,126 | 159,225 | 152,954 | 146,913 | 148,260 | 143,033 | 141,745 | 148,008 | 132,769 |
| 809 | E06000010 | Kingston upon Hull, City of | E01012757 | Kingston upon Hull 025B | 31,376 | 29,795 | 30,367 | 30,543 | 30,956 | 32,781 | ... | 132,178 | 123,194 | 127,827 | 135,008 | 141,380 | 148,043 | 135,320 | 139,061 | 130,330 | 137,602 |
| 810 | E06000010 | Kingston upon Hull, City of | E01012758 | Kingston upon Hull 018A | 49,324 | 46,195 | 43,328 | 51,442 | 60,738 | 64,053 | ... | 224,715 | 238,168 | 230,783 | 239,213 | 249,940 | 238,605 | 246,214 | 240,353 | 252,263 | 275,177 |
| 811 | E06000010 | Kingston upon Hull, City of | E01012759 | Kingston upon Hull 025C | 29,197 | 32,087 | 32,336 | 30,748 | 30,246 | 27,783 | ... | 110,625 | 114,500 | 106,262 | 104,782 | 106,537 | 115,615 | 104,722 | 106,965 | 108,325 | 104,977 |
| 812 | E06000010 | Kingston upon Hull, City of | E01012760 | Kingston upon Hull 025D | 28,461 | 28,470 | 27,965 | 28,415 | 28,411 | 28,692 | ... | 94,397 | 93,843 | 92,871 | 95,963 | 98,633 | 102,472 | 108,855 | 114,807 | 121,153 | 124,452 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 969 | E06000010 | Kingston upon Hull, City of | E01033106 | Kingston upon Hull 001G | : | : | : | : | : | : | ... | 166,842 | 174,737 | 175,191 | 183,083 | 182,017 | 182,152 | 183,913 | 181,918 | 185,428 | 190,369 |
| 970 | E06000010 | Kingston upon Hull, City of | E01033107 | Kingston upon Hull 001H | : | : | : | : | : | : | ... | 173,306 | 190,222 | 176,182 | 173,658 | 187,520 | 192,960 | 200,920 | 207,833 | 213,442 | 210,155 |
| 971 | E06000010 | Kingston upon Hull, City of | E01033108 | Kingston upon Hull 001I | 44,934 | 44,546 | 44,415 | 45,978 | 44,602 | 45,403 | ... | 213,374 | 210,026 | 207,826 | 206,988 | 194,688 | 196,092 | 195,953 | 192,474 | 193,118 | 191,822 |
| 972 | E06000010 | Kingston upon Hull, City of | E01033109 | Kingston upon Hull 029F | 29,343 | 29,512 | 31,809 | 28,661 | 27,863 | 28,073 | ... | 89,854 | 102,372 | 115,406 | 112,238 | 112,042 | 109,554 | 88,679 | 92,357 | 92,115 | 97,469 |
| 973 | E06000010 | Kingston upon Hull, City of | E01033110 | Kingston upon Hull 031G | 22,486 | 21,764 | 21,045 | 20,791 | 19,715 | 20,092 | ... | 141,950 | 139,057 | 135,223 | 139,014 | 136,958 | 136,026 | 136,205 | 136,937 | 131,466 | 138,136 |
166 rows × 113 columns
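The prices in this table are strings with thousands separators, and `:` marks a missing value. As an alternative to cleaning the strings afterwards, `pd.read_csv` can parse them at load time through its `thousands` and `na_values` parameters. The sketch below uses a small in-memory CSV whose layout is assumed to resemble the real file; it has not been checked against the actual HPSSA download.

```python
import io
import pandas as pd

# Small in-memory CSV mimicking the assumed layout of the house price file.
csv_text = (
    'LSOA code,Year ending Dec 1995,Year ending Dec 2022\n'
    'E01012756,"40,920","132,769"\n'
    'E01033106,:,"190,369"\n'
)

# thousands=',' strips the separators; na_values=':' turns the placeholder into NaN.
df = pd.read_csv(io.StringIO(csv_text), thousands=',', na_values=':')
print(df.dtypes)
print(df)
```

With this approach the price columns arrive as numeric dtypes, and missing prices stay as NaN rather than being converted to a number.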
8 Merge the LSOA-month crime counts with the house price data
df_crime_hull_agg_hp = pd.merge(df_crime_hull_agg, df_house_price_hull[['LSOA code', 'Year ending Dec 2022']], left_on='LSOA21CD', right_on='LSOA code', how='left')
df_crime_hull_agg_hp
| | LSOA21CD | Month | Numbers | LSOA code | Year ending Dec 2022 |
|---|---|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 | E01012756 | 132,769 |
| 1 | E01012756 | 2024-02 | 11 | E01012756 | 132,769 |
| 2 | E01012756 | 2024-03 | 15 | E01012756 | 132,769 |
| 3 | E01012756 | 2024-04 | 10 | E01012756 | 132,769 |
| 4 | E01012756 | 2024-05 | 12 | E01012756 | 132,769 |
| ... | ... | ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 | NaN | NaN |
| 1998 | E01035472 | 2024-09 | 4 | NaN | NaN |
| 1999 | E01035472 | 2024-10 | 1 | NaN | NaN |
| 2000 | E01035472 | 2024-11 | 3 | NaN | NaN |
| 2001 | E01035472 | 2024-12 | 10 | NaN | NaN |
2002 rows × 5 columns
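Some LSOAs in the crime table have no match in the house price table (E01035472 shows NaN above). To see which keys fail to match, `pd.merge` accepts `indicator=True`, which adds a `_merge` column, and `validate='many_to_one'`, which raises an error if the right-hand keys are not unique. A minimal sketch on toy data; the frames and codes are illustrative only:

```python
import pandas as pd

# Toy stand-ins for the crime counts and the house price table.
df_crime = pd.DataFrame({'LSOA21CD': ['A', 'A', 'Z'], 'Numbers': [3, 11, 4]})
df_price = pd.DataFrame({
    'LSOA code': ['A', 'B'],
    'Year ending Dec 2022': ['132,769', '137,602'],
})

# indicator=True records where each row came from;
# validate='many_to_one' checks that each LSOA code appears once on the right.
checked = pd.merge(
    df_crime, df_price,
    left_on='LSOA21CD', right_on='LSOA code',
    how='left', indicator=True, validate='many_to_one',
)

# Crime rows whose LSOA code was not found in the price table.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'LSOA21CD'].unique().tolist()
print(unmatched)
```

Listing the unmatched codes makes it easier to decide whether the gaps come from differing code vintages, typos, or genuinely missing data.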
df_crime_hull_agg_hp['Year ending Dec 2022'].values[0]
'132,769'
# 'Year ending Dec 2022' holds strings such as '132,769', so it must be cleaned and converted to float
# note: missing prices (NaN after the merge, or ':' in the source) are set to 0 below, so 0 means "no price available"
df_crime_hull_agg_hp['Year ending Dec 2022'] = df_crime_hull_agg_hp['Year ending Dec 2022'].fillna('')
df_crime_hull_agg_hp['Year ending Dec 2022'] = df_crime_hull_agg_hp['Year ending Dec 2022'].replace({':': '', ',': ''}, regex=True)
df_crime_hull_agg_hp['Year ending Dec 2022'] = df_crime_hull_agg_hp['Year ending Dec 2022'].replace('', 0)
df_crime_hull_agg_hp['Year ending Dec 2022'] = df_crime_hull_agg_hp['Year ending Dec 2022'].astype(float)
df_crime_hull_agg_hp
| | LSOA21CD | Month | Numbers | LSOA code | Year ending Dec 2022 |
|---|---|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 | E01012756 | 132769.0 |
| 1 | E01012756 | 2024-02 | 11 | E01012756 | 132769.0 |
| 2 | E01012756 | 2024-03 | 15 | E01012756 | 132769.0 |
| 3 | E01012756 | 2024-04 | 10 | E01012756 | 132769.0 |
| 4 | E01012756 | 2024-05 | 12 | E01012756 | 132769.0 |
| ... | ... | ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 | NaN | 0.0 |
| 1998 | E01035472 | 2024-09 | 4 | NaN | 0.0 |
| 1999 | E01035472 | 2024-10 | 1 | NaN | 0.0 |
| 2000 | E01035472 | 2024-11 | 3 | NaN | 0.0 |
| 2001 | E01035472 | 2024-12 | 10 | NaN | 0.0 |
2002 rows × 5 columns
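Setting missing prices to 0, as in the output above, pulls down any mean or model fitted on this column. An alternative that keeps them as NaN is `pd.to_numeric` with `errors='coerce'`. This is a minimal sketch on a toy Series, not a replacement for the tutorial's cleaning step; the values are illustrative.

```python
import pandas as pd

# Toy price column: formatted strings, a ':' placeholder, and a missing value.
prices = pd.Series(['132,769', ':', None, '97,469'])

# Remove thousands separators, then coerce anything non-numeric to NaN.
cleaned = pd.to_numeric(prices.str.replace(',', '', regex=False), errors='coerce')

print(cleaned)
print(cleaned.mean())  # NaN values are skipped by default
```

Here the mean is computed over the two valid prices only, whereas filling with 0 would average over all four entries.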
df_crime_hull_agg_hp = df_crime_hull_agg_hp.rename(columns={'Year ending Dec 2022': 'House Price 2022'})
df_crime_hull_agg_hp
| | LSOA21CD | Month | Numbers | LSOA code | House Price 2022 |
|---|---|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 | E01012756 | 132769.0 |
| 1 | E01012756 | 2024-02 | 11 | E01012756 | 132769.0 |
| 2 | E01012756 | 2024-03 | 15 | E01012756 | 132769.0 |
| 3 | E01012756 | 2024-04 | 10 | E01012756 | 132769.0 |
| 4 | E01012756 | 2024-05 | 12 | E01012756 | 132769.0 |
| ... | ... | ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 | NaN | 0.0 |
| 1998 | E01035472 | 2024-09 | 4 | NaN | 0.0 |
| 1999 | E01035472 | 2024-10 | 1 | NaN | 0.0 |
| 2000 | E01035472 | 2024-11 | 3 | NaN | 0.0 |
| 2001 | E01035472 | 2024-12 | 10 | NaN | 0.0 |
2002 rows × 5 columns
9 Read population data and merge it with the crime data
# read the population data
df_population = pd.read_csv('data/Lower layer Super Output Area population estimates 2019-2022.csv')
df_population.head()
| | LAD 2021 Code | LAD 2021 Name | LSOA 2021 Code | LSOA 2021 Name | Total | F0 | F1 | F2 | F3 | F4 | ... | M81 | M82 | M83 | M84 | M85 | M86 | M87 | M88 | M89 | M90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | E06000001 | Hartlepool | E01011949 | Hartlepool 009A | 1,870 | 15 | 3 | 11 | 13 | 6 | ... | 6 | 6 | 3 | 3 | 3 | 3 | 2 | 1 | 3 | 1 |
| 1 | E06000001 | Hartlepool | E01011950 | Hartlepool 008A | 1,097 | 6 | 5 | 8 | 8 | 5 | ... | 1 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | E06000001 | Hartlepool | E01011951 | Hartlepool 007A | 1,241 | 8 | 7 | 5 | 8 | 3 | ... | 1 | 1 | 3 | 1 | 2 | 2 | 1 | 0 | 1 | 0 |
| 3 | E06000001 | Hartlepool | E01011952 | Hartlepool 002A | 1,615 | 13 | 11 | 17 | 15 | 15 | ... | 0 | 3 | 1 | 2 | 3 | 3 | 4 | 3 | 0 | 13 |
| 4 | E06000001 | Hartlepool | E01011953 | Hartlepool 002B | 1,982 | 9 | 12 | 18 | 11 | 13 | ... | 3 | 4 | 3 | 2 | 0 | 1 | 0 | 0 | 1 | 4 |
5 rows × 187 columns
df_population_hull = df_population[df_population['LAD 2021 Name'].str.contains('Kingston upon Hull')]
df_population_hull.head()
| | LAD 2021 Code | LAD 2021 Name | LSOA 2021 Code | LSOA 2021 Name | Total | F0 | F1 | F2 | F3 | F4 | ... | M81 | M82 | M83 | M84 | M85 | M86 | M87 | M88 | M89 | M90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 822 | E06000010 | Kingston upon Hull | E01012756 | Kingston upon Hull 025A | 1,477 | 7 | 2 | 4 | 9 | 6 | ... | 5 | 4 | 7 | 5 | 0 | 0 | 0 | 0 | 0 | 0 |
| 823 | E06000010 | Kingston upon Hull | E01012757 | Kingston upon Hull 025B | 1,388 | 4 | 8 | 3 | 6 | 3 | ... | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 4 | 1 | 1 |
| 824 | E06000010 | Kingston upon Hull | E01012758 | Kingston upon Hull 018A | 1,538 | 3 | 5 | 5 | 5 | 4 | ... | 1 | 0 | 2 | 1 | 0 | 2 | 1 | 0 | 0 | 0 |
| 825 | E06000010 | Kingston upon Hull | E01012759 | Kingston upon Hull 025C | 1,670 | 21 | 8 | 18 | 7 | 16 | ... | 1 | 0 | 1 | 3 | 0 | 1 | 1 | 1 | 0 | 2 |
| 826 | E06000010 | Kingston upon Hull | E01012760 | Kingston upon Hull 025D | 1,435 | 10 | 3 | 3 | 10 | 4 | ... | 2 | 1 | 3 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
5 rows × 187 columns
df_population_hull = df_population_hull[['LSOA 2021 Code', 'Total']]
# merge the population data with df_crime_hull_agg_hp
df_crime_hull_agg_hp_pop = pd.merge(df_crime_hull_agg_hp, df_population_hull, left_on='LSOA21CD', right_on='LSOA 2021 Code', how='left')
df_crime_hull_agg_hp_pop
| | LSOA21CD | Month | Numbers | LSOA code | House Price 2022 | LSOA 2021 Code | Total |
|---|---|---|---|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 | E01012756 | 132769.0 | E01012756 | 1,477 |
| 1 | E01012756 | 2024-02 | 11 | E01012756 | 132769.0 | E01012756 | 1,477 |
| 2 | E01012756 | 2024-03 | 15 | E01012756 | 132769.0 | E01012756 | 1,477 |
| 3 | E01012756 | 2024-04 | 10 | E01012756 | 132769.0 | E01012756 | 1,477 |
| 4 | E01012756 | 2024-05 | 12 | E01012756 | 132769.0 | E01012756 | 1,477 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 | NaN | 0.0 | E01035472 | 1,768 |
| 1998 | E01035472 | 2024-09 | 4 | NaN | 0.0 | E01035472 | 1,768 |
| 1999 | E01035472 | 2024-10 | 1 | NaN | 0.0 | E01035472 | 1,768 |
| 2000 | E01035472 | 2024-11 | 3 | NaN | 0.0 | E01035472 | 1,768 |
| 2001 | E01035472 | 2024-12 | 10 | NaN | 0.0 | E01035472 | 1,768 |
2002 rows × 7 columns
df_crime_hull_agg_hp_pop.Total.values
array(['1,477', '1,477', '1,477', ..., '1,768', '1,768', '1,768'],
shape=(2002,), dtype=object)
# clean the 'Total' column (remove thousands separators, convert to float), then rename it to 'Population'
df_crime_hull_agg_hp_pop['Total'] = df_crime_hull_agg_hp_pop['Total'].replace({',': ''}, regex=True).astype(float)
df_crime_hull_agg_hp_pop = df_crime_hull_agg_hp_pop.rename(columns={'Total': 'Population'})
df_crime_hull_agg_hp_pop
| | LSOA21CD | Month | Numbers | LSOA code | House Price 2022 | LSOA 2021 Code | Population |
|---|---|---|---|---|---|---|---|
| 0 | E01012756 | 2024-01 | 3 | E01012756 | 132769.0 | E01012756 | 1477.0 |
| 1 | E01012756 | 2024-02 | 11 | E01012756 | 132769.0 | E01012756 | 1477.0 |
| 2 | E01012756 | 2024-03 | 15 | E01012756 | 132769.0 | E01012756 | 1477.0 |
| 3 | E01012756 | 2024-04 | 10 | E01012756 | 132769.0 | E01012756 | 1477.0 |
| 4 | E01012756 | 2024-05 | 12 | E01012756 | 132769.0 | E01012756 | 1477.0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1997 | E01035472 | 2024-07 | 4 | NaN | 0.0 | E01035472 | 1768.0 |
| 1998 | E01035472 | 2024-09 | 4 | NaN | 0.0 | E01035472 | 1768.0 |
| 1999 | E01035472 | 2024-10 | 1 | NaN | 0.0 | E01035472 | 1768.0 |
| 2000 | E01035472 | 2024-11 | 3 | NaN | 0.0 | E01035472 | 1768.0 |
| 2001 | E01035472 | 2024-12 | 10 | NaN | 0.0 | E01035472 | 1768.0 |
2002 rows × 7 columns
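With counts and population in the same table, one possible next step is to normalise the counts so that LSOAs of different sizes can be compared. The sketch below computes crimes per 1,000 residents on toy rows. The column name `Crime rate per 1000` is a hypothetical choice, and whether the population estimate is the right denominator for the crime month is left as an assumption.

```python
import pandas as pd

# Toy rows shaped like df_crime_hull_agg_hp_pop (values are illustrative).
df = pd.DataFrame({
    'LSOA21CD': ['E01012756', 'E01035472'],
    'Month': ['2024-01', '2024-12'],
    'Numbers': [3, 10],
    'Population': [1477.0, 1768.0],
})

# Crimes per 1,000 residents; the column name is a hypothetical choice.
df['Crime rate per 1000'] = df['Numbers'] / df['Population'] * 1000
print(df)
```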