Preliminaries: package installation, mounting to Gdrive

In [ ]:
!pip install pandas fiona shapely pyproj rtree folium matplotlib mapclassify
!pip install geopandas
!pip install openturns

from google.colab import drive
drive.mount('/content/drive')
In [5]:
import geopandas as gpd
import pandas as pd

Estimating Probabilities of Extreme Rainfalls in the Philippines

In analyzing extreme rainfalls, we can analyze the historical maximum rainfall that occured.

In [7]:
# other preliminaries to mapping
import matplotlib.pyplot as plt
# provincial shape file for mapping
province_loc = "drive/MyDrive/Colab Notebooks/Data/provinces/provinces.geojson"
province = gpd.read_file(province_loc)

# reading historical rainfall maxima per station
stations = pd.read_csv("drive/MyDrive/Colab Notebooks/Data/NOAA PH Weather Stations.csv")
stations.head()
Out[7]:
Region Administrative Province Station Name LAT LON ALT Rx1day (mm) Period Covered
0 I llocos Norte Laoag 18.178 120.532 7.6 564.2 1907-2018
1 I llocos Sur Sinait-Vigan 17.883 120.467 570.0 594.1 1903-2018
2 I Pangasinan Dagupan 16.082 120.350 20.0 722.6 1903-2018
3 CAR Benguet Baguio 16.400 120.600 15010.0 1085.8 1902-2018
4 II Batanes Basco Radar 20.433 121.967 1670.0 616.4 1903-2018
In [8]:
# plotting historical maximum 1-day rainfall (Rx1day) per station

fig, ax1 = plt.subplots(figsize=(10, 10))

# basemap
province.plot(ax=ax1, alpha = 0.1)

# layer
plot_stations = gpd.GeoDataFrame(stations, geometry=gpd.points_from_xy(stations.LON, stations.LAT), crs = 4326)
plot_stations.plot(ax = ax1,column = 'Rx1day (mm)', legend = True)
plt.title("Historical Single Day Rainfall Amount (mm) Recorded by PH Weather Stations")
Out[8]:
Text(0.5, 1.0, 'Historical Single Day Rainfall Amount (mm) Recorded by PH Weather Stations')

Data Sources:

An initial analysis for this is that north and eastern part of the country experienced the highest rainfall magnitude compared to the other parts of the country.

However, this does not answer the question "how frequent are these extreme events?"

Return Period and Return Levels

Obtaining probabilities is very important for risk assessment in the Philippines, especially that we are ranked 1 in the World Risk Index (World Risk Report 2022) and is more insightful if we analyze the probabilities of the extreme events.

In probabilistic risk assessment in hydrology and flood modelling, the concept of "return periods" or "recurrence intervals" is commonly used.

For example, if a 400 mm of rainfall is a 50-year return period event, then it has a 1/50 = 2% chance of being exceeded within a year. Another intuitive interpretation is that a 400 mm of rainfall is expected to happen once every 50 years.

Another example: how can we assure that a dam to be constructed can last at least 100 years if we do not have 100 years worth of data? Using current data, we will extrapolate and estimate the single day rainfall magnitude that will exist once in every 100 years.

image.png

With this approach, even if we don't have a historically long data (i.e. observed data does not cover 50-100 years), we can determine the amount of rainfall that is exceeded once every 50 years, once every 100 years, and so on, to assist planning big ticket infrastructures such as dams and bridges.

This is also helpful in estimating more frequent extreme events, such as 5-year, 10-year, and 15-year return levels since these are the events that people encounter much often, which has great utility in disaster risk reduction and mitigation.

Reading Rainfall Data

In [9]:
rainfall_max = pd.read_csv("drive/MyDrive/Colab Notebooks/Data/max.spread.csv")
rainfall_max.head()
Out[9]:
YEAR Laoag Sinait-Vigan Dagupan Baguio Basco Radar Itbayat Aparri Calayan Tuguegarao ... Dipolog Zamboanga Malaybalay Lumbia-El Salvador Davao City General Santos Butuan Hinatuan Surigao Cotabato
0 1981 400.050 115.062 109.982 214.884 NaN NaN 375.500 27.940 260.096 ... 36.068 48.006 NaN 17.018 99.060 7.112 27.940 320.040 NaN NaN
1 1982 66.040 16.002 33.020 208.026 NaN NaN 130.048 NaN 35.052 ... 55.880 91.948 NaN NaN NaN NaN 134.112 75.946 NaN NaN
2 1983 16.002 17.018 23.876 300.990 NaN NaN 99.060 1.016 41.910 ... 70.104 39.878 119.888 45.974 17.018 30.988 107.950 34.036 NaN NaN
3 1984 25.908 NaN 23.876 89.916 NaN NaN 6.096 NaN 20.066 ... 58.928 121.920 119.888 219.964 NaN 114.046 96.012 24.892 13.97 NaN
4 1985 214.122 27.940 93.980 121.920 NaN NaN 94.996 NaN 22.098 ... 46.990 55.118 NaN 28.956 49.022 3.048 77.978 41.910 41.91 NaN

5 rows × 56 columns

Fitting Univariate GEVD per station

We will just fit multiple univariate distributions by estimating the location $\mu$, scale $\sigma$, and shape $\xi$ parameters in each station using Maximum Likelihood Estimation. We will use the openturns package to fit a distribution.

In [10]:
import openturns as ot
import numpy as np

After estimating the distribution, return levels can be computed. As a preliminary, note that the $p^{th}$ quantile of a random variable that follows GEV distribution is

\begin{equation} \hat{x_p}=\left\{ \begin{array}{@{}ll@{}} {\hat\mu+\frac{\hat\sigma(1-(-log(p))^\hat\xi)}{\hat\xi}}, & \text{if}\ \hat\xi \neq 0\ \\ {\hat\mu-\hat\sigma(log(-log(p))}, & \text{if}\ \hat\xi = 0 \end{array}\right. \end{equation}

Then, the T-year return level $x_T$ is calculated accordingly by setting $$ p = 1-\alpha=1-\frac{1}{T} $$

The following function qgevd computes the $p^{th}$ quantile from an estimated distribution.

In [11]:
# return level function
def qgevd(p, location, scale, shape):
  
  if shape !=0:
    quantile = location + (scale*(1-(-np.log(p))**shape))/shape
  elif shape ==0:
    quantile = location - scale*(np.log(-np.log(p)))
  
  return quantile

The following code outputs a dataframe of parameters and T-year rainfall return level per station.

In [12]:
# initialize the parameter dataframe
pars = pd.DataFrame(columns=['station_name','location','scale','shape', 'rl_5', 'rl_10', 'rl_25','rl_50', 'rl_100'])

for i in range(1,len(rainfall_max.columns)):
    # getting specific column containing data from specific station i
    station_data = rainfall_max.iloc[:, i] 
    # removing NaN
    station_data = station_data[np.logical_not(np.isnan(station_data))] 
    # converting to ot.point object
    sample = ot.Sample([[p] for p in station_data]) 
    # fitting a GEVD to the data
    gev = ot.GeneralizedExtremeValueFactory().buildAsGeneralizedExtremeValue(sample) 
    # extracting the parameters
    loc0 = list(gev.getParameter())[0]
    scale0 = list(gev.getParameter())[1]
    shape0 = -list(gev.getParameter())[2] # note the negation. In other literature in the formulation of the pdf, this is used.

    # n year return level calculation
    rl_5 = qgevd(p = 1-1/5, location = loc0, scale = scale0, shape = shape0) 
    rl_10 = qgevd(p = 1-1/10, location = loc0, scale = scale0, shape = shape0) 
    rl_25 = qgevd(p = 1-1/25, location = loc0, scale = scale0, shape = shape0)
    rl_50 = qgevd(p = 1-1/50, location = loc0, scale = scale0, shape = shape0)
    rl_100 = qgevd(p = 1-1/100, location = loc0, scale = scale0, shape = shape0)

    pars_temp = pd.DataFrame([[rainfall_max.columns[i], loc0, scale0, shape0, rl_5, rl_10, rl_25, rl_50, rl_100]], 
                             columns=['station_name','location','scale','shape', 'rl_5', 'rl_10', 'rl_25','rl_50', 'rl_100'], 
                             index = [i-1]) 
    # appending the parameters to the pars dataframe
    pars = pars.append(pars_temp)
In [13]:
# we concatenate the parameter dataframe to the initial data frame containint PAGASA station coordinates
stations_pars = pd.concat([stations, pars], axis=1)
plot_stations_pars = gpd.GeoDataFrame(stations_pars, geometry=gpd.points_from_xy(stations_pars.LON, stations_pars.LAT), crs = 4326)
plot_stations_pars
Out[13]:
Region Administrative Province Station Name LAT LON ALT Rx1day (mm) Period Covered geometry station_name location scale shape rl_5 rl_10 rl_25 rl_50 rl_100
0 I llocos Norte Laoag 18.178 120.532 7.6 564.2 1907-2018 POINT (120.53200 18.17800) Laoag 197.394828 125.637875 0.312968 347.791715 400.339673 451.306790 480.457813 503.694190
1 I llocos Sur Sinait-Vigan 17.883 120.467 570.0 594.1 1903-2018 POINT (120.46700 17.88300) Sinait-Vigan 127.858030 89.567592 0.257335 239.313535 280.863326 323.094247 348.397584 369.369054
2 I Pangasinan Dagupan 16.082 120.350 20.0 722.6 1903-2018 POINT (120.35000 16.08200) Dagupan 128.930322 88.019835 -0.000000 260.954792 327.007283 410.464780 472.378319 533.834698
3 CAR Benguet Baguio 16.400 120.600 15010.0 1085.8 1902-2018 POINT (120.60000 16.40000) Baguio 261.078928 156.915378 0.626219 413.703436 450.430777 477.843988 489.889935 497.598555
4 II Batanes Basco Radar 20.433 121.967 1670.0 616.4 1903-2018 POINT (121.96700 20.43300) Basco Radar 131.843027 48.866264 -0.000000 205.139491 241.810071 288.143447 322.516193 356.635135
5 II Batanes Itbayat 20.800 121.850 1240.0 572.3 1971-2018 POINT (121.85000 20.80000) Itbayat 137.426497 48.461714 -0.000000 210.116159 246.483154 292.432948 326.521131 360.357612
6 II Cagayan Aparri 18.367 121.633 30.0 453.1 1903-2018 POINT (121.63300 18.36700) Aparri 97.583207 79.918297 -0.000000 217.455856 277.428731 353.204618 409.419499 465.219299
7 II Cagayan Calayan 19.267 121.467 130.0 522.2 1919-2018 POINT (121.46700 19.26700) Calayan 49.940185 49.982384 -0.000000 124.910762 162.418909 209.810554 244.968382 279.866611
8 II Cagayan Tuguegarao 17.633 121.750 620.0 349.7 1902-2018 POINT (121.75000 17.63300) Tuguegarao 126.676301 63.258603 -0.000000 221.560410 269.031395 329.011111 373.507490 417.675315
9 III Aurora Baler Radar 15.750 121.633 1780.0 675.6 1949-2018 POINT (121.63300 15.75000) Baler Radar 131.397837 52.175754 -0.000000 209.658337 248.812450 298.283775 334.984430 371.414093
10 III Aurora Casiguran 16.267 122.133 40.0 402.6 1919-2018 POINT (122.13300 16.26700) Casiguran 151.490480 78.675103 0.222742 251.808820 290.736376 331.472276 356.594143 377.926244
11 III Pampanga Clark 15.183 120.550 1550.0 274.5 1997-2018 POINT (120.55000 15.18300) Clark 85.975768 53.426880 -0.000000 166.112881 206.205872 256.863473 294.444175 331.747387
12 III Nueva Ecija Cabanatuan 15.467 120.950 320.0 297.2 1919-2018 POINT (120.95000 15.46700) Cabanatuan 97.643741 46.633972 -0.000000 167.591901 202.587309 246.804099 279.606641 312.166973
13 III Zambales Iba 15.333 119.967 50.0 623.7 1903-2018 POINT (119.96700 15.33300) Iba 146.381234 82.155169 0.188577 253.714586 297.039223 343.702108 373.309613 399.060008
14 III Zambales Cubi Pt., Subic Bay 14.800 120.267 190.0 436.4 1994-2018 POINT (120.26700 14.80000) Cubi Pt., Subic Bay 194.279435 102.320517 0.429147 307.449638 341.937193 372.280758 388.025507 399.594059
15 IV-A Batangas Ambulong 14.083 121.050 110.0 499.2 1919-2018 POINT (121.05000 14.08300) Ambulong 94.940529 74.508585 -0.000000 206.698935 262.612214 333.258791 385.668457 437.691138
16 IV-A Cavite Sangley Point 14.495 120.904 40.0 475.4 1974-2018 POINT (120.90400 14.49500) Sangley Point 128.199478 54.291530 -0.000000 209.633515 250.375364 301.852798 340.041699 377.948619
17 IV-A Quezon Alabat 14.100 122.017 50.0 673.0 1952-2018 POINT (122.01700 14.10000) Alabat 126.029161 59.316632 -0.000000 215.000549 259.513371 315.755440 357.479020 398.894519
18 IV-A Quezon Infanta 14.750 121.650 70.0 339.0 1949-2018 POINT (121.65000 14.75000) Infanta 141.302037 71.408172 0.281082 228.696336 260.388896 291.963483 310.510544 325.628636
19 IV-A Quezon Tayabas 14.017 121.600 1580.0 557.7 1970-2018 POINT (121.60000 14.01700) Tayabas 142.613403 67.691851 -0.000000 244.147117 294.944933 359.128108 406.742854 454.006020
20 IV-A Rizal Tanay 14.583 121.367 6510.0 331.8 2000-2018 POINT (121.36700 14.58300) Tanay 113.645911 58.802038 -0.000000 201.845439 245.972096 301.726244 343.087856 384.144061
21 IV-B Occidental Mindoro San Jose 12.367 121.050 30.0 286.7 1980-2018 POINT (121.05000 12.36700) San Jose 110.260241 61.918066 0.382082 180.952424 203.727354 224.571778 235.823435 244.367988
22 IV-B Oriental Mindoro Calapan 13.417 121.183 410.0 277.4 1919-2018 POINT (121.18300 13.41700) Calapan 115.366531 67.125654 0.259719 198.756274 229.757391 261.203513 280.007513 295.566292
23 IV-B Palawan Coron 12.220 120.200 600.0 317.6 1949-2018 POINT (120.20000 12.22000) Coron 94.872957 41.838021 -0.000000 157.627478 189.023874 228.693302 258.122350 287.334099
24 IV-B Palawan Cuyo 10.850 121.000 40.0 294.4 1902-2018 POINT (121.00000 10.85000) Cuyo 67.996734 42.004008 -0.000000 131.000225 162.521181 202.347992 231.893795 261.221437
25 IV-B Palawan Puerto Princesa 9.733 118.767 150.0 269.3 1949-2018 POINT (118.76700 9.73300) Puerto Princesa 82.703001 41.697029 -0.000000 145.246042 176.536632 216.072376 245.402250 274.515556
26 IV-B Romblon Romblon 12.583 122.283 470.0 385.8 1904-2018 POINT (122.28300 12.58300) Romblon 95.200562 70.724211 -0.000000 201.282634 254.356015 321.414373 371.162094 420.542485
27 NCR Metro Manila NAIA (MIA) 14.509 121.020 22.9 472.4 1947-2018 POINT (121.02000 14.50900) NAIA (MIA) 74.319310 81.336198 -0.000000 196.318726 257.355632 334.475925 391.688164 448.477957
28 NCR Metro Manila Port Area (MCO) 14.583 120.983 130.0 403.1 1865-2018 POINT (120.98300 14.58300) Port Area (MCO) 125.070124 50.868121 -0.000000 201.369252 239.542081 287.773551 323.554411 359.071071
29 NCR Metro Manila Science Garden 14.650 121.050 460.0 455.0 1961-2018 POINT (121.05000 14.65000) Science Garden 120.035477 76.532093 -0.000000 234.829024 292.260799 364.826000 418.659011 472.094527
30 V Albay Legazpi 13.133 123.733 170.0 484.6 1902-2018 POINT (123.73300 13.13300) Legazpi 147.259594 71.512710 -0.000000 254.524366 308.189459 375.995445 426.297800 476.228729
31 V Camarines Norte Daet 14.133 122.983 40.0 507.5 1920-2018 POINT (122.98300 14.13300) Daet 166.261321 83.543521 -0.000000 291.571589 354.264931 433.478136 492.243016 550.573985
32 V Catanduanes Virac Synop 13.600 124.217 400.0 494.2 1908-2018 POINT (124.21700 13.60000) Virac Synop 109.145215 59.403420 -0.000000 198.246780 242.824731 299.149089 340.933716 382.409812
33 V Masbate Masbate 12.367 123.633 60.0 603.5 1907-2018 POINT (123.63300 12.36700) Masbate 95.989159 64.633914 0.229973 177.982995 209.534833 242.351728 262.468337 279.463884
34 VI Capiz Roxas City 11.600 122.750 40.0 370.2 1903-2018 POINT (122.75000 11.60000) Roxas City 111.993938 54.044305 -0.000000 193.057153 233.613477 284.856500 322.871502 360.605808
35 VI lloilo Iloilo 10.713 122.541 8.2 319.8 1903-2010 POINT (122.54100 10.71300) Iloilo 100.375271 50.121733 -0.000000 175.554862 213.167580 260.691350 295.947197 330.942721
36 VII Bohol Tagbilaran-Dauis 9.667 123.850 80.0 229.1 1903-2018 POINT (123.85000 9.66700) Tagbilaran-Dauis 79.383611 35.628605 -0.000000 132.824380 159.561059 193.342924 218.404242 243.280510
37 VII Cebu Mactan 10.317 123.983 240.0 276.1 1972-2018 POINT (123.98300 10.31700) Mactan 100.786792 46.146602 -0.000000 170.003926 204.633598 248.388280 280.848003 313.068049
38 VII Negros Oriental Dumaguete 9.333 123.300 80.0 208.3 1910-2018 POINT (123.30000 9.33300) Dumaguete 68.189832 37.722110 -0.000000 124.770734 153.078437 188.845294 215.379192 241.717168
39 VIII Eastern Samar Borongan 11.667 125.450 30.0 427.8 1949-2018 POINT (125.45000 11.66700) Borongan 114.633119 84.186265 -0.000000 240.907464 304.083139 383.905772 443.122761 501.902501
40 VIII Eastern Samar Guiuan 11.036 125.742 2.1 780.4 1973-2018 POINT (125.74200 11.03600) Guiuan 115.505399 49.879834 -0.000000 190.322156 227.753348 275.047757 310.133451 344.960078
41 VIII Leyte Tacloban 11.228 125.028 3.0 325.9 1903-2018 POINT (125.02800 11.22800) Tacloban 106.673660 62.734474 0.250104 185.136299 214.633117 244.796605 262.978902 278.125007
42 VIII Northern Samar Catarman 12.500 124.633 70.0 485.8 1919-2018 POINT (124.63300 12.50000) Catarman 120.922927 75.665693 -0.000000 234.416925 291.198530 362.942239 416.165820 468.996407
43 VIII Southern Leyte Maasin 10.133 124.867 720.0 281.9 1973-2018 POINT (124.86700 10.13300) Maasin 92.111320 72.560691 0.328187 178.063758 207.565075 235.815243 251.768713 264.350324
44 VIII Western Samar Catbalogan 11.783 124.883 50.0 387.9 1919-2018 POINT (124.88300 11.78300) Catbalogan 119.914497 63.263942 -0.000000 214.806614 262.281606 322.266384 366.766519 410.938072
45 IX Zamboanga Del Norte Dipolog 8.600 123.350 50.0 295.8 1949-2018 POINT (123.35000 8.60000) Dipolog 94.207890 44.534760 -0.000000 161.007358 194.427460 236.653847 267.979793 299.074433
46 IX Zamboanga del Sur Zamboanga 6.917 122.067 60.0 199.1 1902-2018 POINT (122.06700 6.91700) Zamboanga 70.418530 31.358244 -0.000000 117.454014 140.986098 170.718947 192.776474 214.671131
47 X Bukidnon Malaybalay 8.150 125.133 6270.0 195.9 1949-2018 POINT (125.13300 8.15000) Malaybalay 96.501660 31.982074 0.324708 134.477245 147.564748 160.133897 167.252656 172.880533
48 X Misamis Oriental Lumbia-El Salvador 8.417 124.617 1880.0 237.1 1977-2018 POINT (124.61700 8.41700) Lumbia-El Salvador 66.985568 43.033360 -0.000000 131.533024 163.826434 204.629242 234.899097 264.945443
49 XI Davao Del Sur Davao City 7.126 125.646 29.3 242.6 1902-2018 POINT (125.64600 7.12600) Davao City 78.178818 39.424259 -0.000000 137.312840 166.897882 204.278661 232.009858 259.536293
50 XII South Cotabato General Santos 6.050 125.100 133.0 189.5 1949-2018 POINT (125.10000 6.05000) General Santos 50.478281 33.541494 -0.000000 100.788509 125.958964 157.761900 181.355134 204.774160
51 CARAGA Agusan Del Norte Butuan 8.950 125.483 460.0 271.6 1980-2018 POINT (125.48300 8.95000) Butuan 91.290882 37.164093 -0.000000 147.034792 174.923743 210.161507 236.302894 262.251256
52 CARAGA Surigao Del Sur Hinatuan 8.367 126.333 30.0 375.5 1949-2018 POINT (126.33300 8.36700) Hinatuan 118.894742 68.794405 -0.000000 222.082220 273.707423 338.936002 387.326289 435.359269
53 CARAGA Surigao Del Norte Surigao 9.783 125.483 550.0 566.4 1902-2018 POINT (125.48300 9.78300) Surigao 132.274502 83.462794 0.301759 232.963579 268.607535 303.506499 323.655165 339.842421
54 BARMM Maguindanao Cotabato 7.167 124.217 620.0 201.7 1986-2018 POINT (124.21700 7.16700) Cotabato 60.493031 40.181194 0.216483 111.955772 132.070411 153.230843 166.348737 177.536707

Visualization

The visualization below shows the amount of rainfall (mm) that comes once in every 5 years and once in every 100 years per weather station in the Philippines.

In [14]:
fig, (ax1, ax2) = plt.subplots(1,2)

# basemap
province.plot(ax=ax1, alpha = 0.1)
# point layer
plot_stations_pars.plot(column = 'rl_5',legend = True,vmin = 100, vmax = 600, ax = ax1)
# Title 
ax1.set_title("5-year Rainfall Return Levels (mm)")

# basemap
province.plot(ax=ax2, alpha = 0.1)
# point layer
plot_stations_pars.plot(column = 'rl_100',legend = True,vmin = 100, vmax = 600, ax = ax2)
# Title 
ax2.set_title("100-year Rainfall Return Levels (mm)")

# setting chart size
plt.gcf().set_size_inches(20, 10)

The left chart shows the amount of rainfall that comes every 5 years. The chart on the right is the amount of rainfall that comes every 100 years. Regions in the northern and eastern part receive more extreme rainfall levels.

Major caveat for this is that has not been cleaned yet and the date range and data availability are not the same for all stations, which makes some of the estimates inaccurate.


Provincial Level Estimates

Now that we have estimated the parameters and the return levels per weather station, it is more useful if the estimates are at a provincial level. For this, we will perform an interpolation procedure of the estimated parameters.

In [15]:
province.explore()
Out[15]:
Make this Notebook Trusted to load map: File -> Trust Notebook

In the plot below, the red points are the provincial centroids while the green points are the locations of the weather stations.

In [ ]:
province_centroid = province.centroid
fig, ax1 = plt.subplots(figsize=(10, 8))
province.plot(ax = ax1, alpha = 0.2, linewidth = 1)
province_centroid.plot(ax = ax1, color = 'red', label = 'Province Centroid')
plot_stations.plot(ax = ax1,color = 'green', label = 'PAGASA Weather Station')
leg = ax1.legend()
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  """Entry point for launching an IPython kernel.

To recall, we have two geodataframes:

  • province, which contains the boundaries per each province; and
  • plot_stations_pars, which contains the location of each weather station and the GEVD parameters to be used in computation of T-year return levels.

For now, we will assume that the values estimated for the provincial centroid represents the whole province.

Spatial Interpolation of the Parameters

First, we define the interpolation method: Ordinary Kriging

$$\hat{x_0}=\sum_{i=1}^{m}\hat{w}_i x_i$$

where:

  • $\hat{x_0}$ is the estimate value on a location to be interpolated
  • $x_i$ the value at station $i$
  • $\hat{w}_i$ is the estimated weight of the station $i$, using OLS and variograms

We will use the pykrige package to accomplish the spatial interpolation.

In [ ]:
!pip install pykrige
import pykrige as pyk

Now, create the ordinary kriging object by plugging in the required data as arrays: the x and y coordinates, and the value to be interpolated.

We perform this three times, one per parameter (location $\mu$, scale $\sigma$, and shape $\xi$) to be interpolated.

In [17]:
# Create the ordinary kriging objects

# location \mu
OK_loc = pyk.OrdinaryKriging(
    np.array(plot_stations_pars['LON']),
    np.array(plot_stations_pars['LAT']),
    np.array(plot_stations_pars['location']),
    variogram_model="linear",
    verbose=False,
    enable_plotting=False,
)

# scale \sigma
OK_scale = pyk.OrdinaryKriging(
    np.array(plot_stations_pars['LON']),
    np.array(plot_stations_pars['LAT']),
    np.array(plot_stations_pars['scale']),
    variogram_model="linear",
    verbose=False,
    enable_plotting=False,
)

# shape \xi
OK_shape = pyk.OrdinaryKriging(
    np.array(plot_stations_pars['LON']),
    np.array(plot_stations_pars['LAT']),
    np.array(plot_stations_pars['shape']),
    variogram_model="linear",
    verbose=False,
    enable_plotting=False,
)

We now use the ordinary kriging object for the interpolation on arbitrary points (provincial centroids).

In [ ]:
# z is the estimate, ss is the variance
z_loc, ss_loc = OK_loc.execute(style = "points", 
                               xpoints = np.array(province.centroid.x), 
                               ypoints = np.array(province.centroid.y))

z_scale, ss_scale = OK_scale.execute(style = "points", 
                               xpoints = np.array(province.centroid.x), 
                               ypoints = np.array(province.centroid.y))

z_shape, ss_shape = OK_shape.execute(style = "points", 
                               xpoints = np.array(province.centroid.x), 
                               ypoints = np.array(province.centroid.y))
In [19]:
province["location"] = z_loc
province["scale"] = z_scale
province["shape"] = z_shape

Estimating Return Levels per province

We now then apply the qgevd function per province to estimate the corresponding 5, 10, 25, 50, and 100-year return levels.

Recall that the rows in the province dataframe represent a province.

In [20]:
# 5-year return level
province['rl_5'] = province.apply(lambda z: qgevd(1-1/5, z['location'], z['scale'], z['shape']), 
               axis = 1,result_type='expand')

# 10-year return level
province['rl_10'] = province.apply(lambda z: qgevd(1-1/10, z['location'], z['scale'], z['shape']), 
               axis = 1,result_type='expand')

# 25-year return level
province['rl_25'] = province.apply(lambda z: qgevd(1-1/25, z['location'], z['scale'], z['shape']), 
               axis = 1,result_type='expand')

# 50-year return level
province['rl_50'] = province.apply(lambda z: qgevd(1-1/50, z['location'], z['scale'], z['shape']), 
               axis = 1,result_type='expand')

# 100-year return level
province['rl_100'] = province.apply(lambda z: qgevd(1-1/100, z['location'], z['scale'], z['shape']), 
               axis = 1,result_type='expand')
In [21]:
province.head()
Out[21]:
ADM2_EN ADM2_PCODE geometry location scale shape rl_5 rl_10 rl_25 rl_50 rl_100
0 NCR PH133900000 POLYGON ((121.10389 14.51649, 121.10248 14.524... 123.206855 69.042307 0.083838 220.519355 264.800678 316.910100 352.977761 386.736153
1 Abra PH140100000 POLYGON ((121.11067 17.68489, 121.10883 17.739... 139.917852 85.125674 0.083838 259.899244 314.495875 378.744157 423.213759 464.836149
2 Agusan del Norte PH160200000 POLYGON ((125.74862 9.32604, 125.74822 9.32838... 96.231172 51.899371 0.083838 169.381358 202.667801 241.838657 268.950859 294.327175
3 Agusan del Sur PH160300000 POLYGON ((126.22779 8.00019, 126.22727 8.00119... 92.804721 49.376012 0.083838 162.398328 194.066377 231.332737 257.126738 281.269251
4 Aklan PH060400000 MULTIPOLYGON (((122.43948 11.60264, 122.43877 ... 102.579949 56.277272 0.083838 181.900620 217.994895 260.469955 289.869170 317.386070

Visualization per province

In [22]:
fig, (ax1, ax2, ax3, ax4) = plt.subplots(1,4, figsize = (28,7))

# 5-year return level
province.plot(column = 'rl_5', legend = True, vmin = 120, vmax = 500,ax = ax1)
ax1.set_title("5-year Rainfall Return Levels (mm)")

#10-year return level
province.plot(column = 'rl_10', legend = True, vmin = 120, vmax = 500,ax = ax2)
ax2.set_title("10-year Rainfall Return Levels (mm)")

#25-year return level
province.plot(column = 'rl_25', legend = True, vmin = 120, vmax = 500,ax = ax3)
ax3.set_title("25-year Rainfall Return Levels (mm)")

# 50-year return level
province.plot(column = 'rl_50', legend = True, vmin = 120, vmax = 500,ax = ax4)
ax4.set_title("50-year Rainfall Return Levels (mm)")
Out[22]:
Text(0.5, 1.0, '50-year Rainfall Return Levels (mm)')
In [23]:
province.plot(figsize = (8,8),column = 'rl_100', legend = True, vmin = 120, vmax = 500)
plt.title("100-year Rainfall Return Levels (mm)")
Out[23]:
Text(0.5, 1.0, '100-year Rainfall Return Levels (mm)')

Luzon and eastern portion of Visayas experience the most extreme rainfalls in the country, and more frequently.

For future studies, we can downscale this down to city level while considering numerous landscape features such as mountains, water bodies, infrastructure, land-cover characteristics, and components of the climate system.

This is particulary useful for the Philippines having a complex topography. For example, rainfall intensities recorded in Port Area, Manila is different from the rainfall intensities recorded in Science Garden, Quezon City, while being in the same region (NCR).

Summary

To summarize, these are the steps that we did:

  1. Extract recorded annual rainfall maxima per station
  2. Fit univariate GEV distribution on the data from each of the stations by estimating the parameters $(\mu, \sigma, \xi)$
  3. Using the estimated parameters from each station, estimate the the same parameters located on the provincial centroid by spatial interpolation (Ordinary Kriging).
  4. Map the 5-year, 10-year, 50-year, and 100-year rainfall return levels for each of the Philippine provinces.