Block Science observed a strong structural break in the onboarding time series around the China regulatory activity. The break appears to have shifted the entire onboarding process (and its dependencies, such as gas usage) from a stationary process to a (likely transitory) non-stationary one. In other words, since early October the system has been in a transient regime, with volatility around a downward trend. The system identification techniques that build the DT’s extrapolation engine rely on the previous history and a technical assumption of its stationarity, so they are less effective when applied to recent data (e.g. from October onwards). Work to improve the extrapolation is in progress, but we are also exploring running the DT simulations with the dynamic BatchBalancer on data from before the China regulatory activity and then extrapolating forward. This restricts the fit to the period in which the time series was observed to be stationary (allowing us to continue using the VARx methodology for the gas dynamics) and also allows a comparison of the extrapolated simulated system with the actual time series observed after the structural break.
We apply the Chow test to the storage onboarding time-series data to determine, formally, whether the Chinese regulation of the cryptocurrency market on September 24, 2021, marks a structural break (a change in the parameters of a regression model) at that time or thereafter. If a structural break is present, we will then investigate whether the onboarding time series is stationary after the break.
The Chow test was introduced in 1960 by Gregory Chow. It checks whether the true coefficients of two linear regressions, fitted on different time series or on different periods of the same time series, are equal. It is most often used to check for a structural break whose date is known before the analysis, as in our case with the Chinese regulations.
To illustrate how the Chow test works, consider the following regression model:
$y_t=a+bx_{1t} + cx_{2t} + \varepsilon.$
If split into two groups, we have:
$y_t=a_1+b_1x_{1t} + c_1x_{2t} + \varepsilon \,$
and
$y_t=a_2+b_2x_{1t} + c_2x_{2t} + \varepsilon \,$
The null hypothesis of the Chow test asserts that $a_1=a_2$, $b_1=b_2$, and $c_1=c_2$, under the assumption that the errors are independent and identically distributed (i.i.d.) from a normal distribution with unknown variance.
$S_C$ is the sum of squared residuals from the combined data, $S_1$ is the sum of squared residuals from the first group, and $S_2$ is the sum of squared residuals from the second group. $N_1$ and $N_2$ are the numbers of observations in each group, and $k$ is the total number of parameters (here, 3). The Chow test statistic is then:
$$\frac{(S_C -(S_1+S_2))/k}{(S_1+S_2)/(N_1+N_2-2k)}$$

The test statistic follows the F distribution with $k$ and $N_1+N_2-2k$ degrees of freedom.
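As a rough illustration of the formula above, the statistic can be computed directly with NumPy. This is a minimal sketch under stated assumptions, not the implementation used in this notebook. The helper names `ssr` and `chow_statistic` are hypothetical, the regression has a single regressor plus an intercept (so $k=2$), and the data are synthetic.

```python
import numpy as np


def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on the design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def chow_statistic(x, y, split):
    """Chow statistic for a candidate break between positions split-1 and split.

    Hypothetical helper: fits y = a + b*x on the pooled data and on each
    sub-period, then combines the three residual sums as in the formula above.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept + slope
    k = X.shape[1]                             # number of parameters (2 here)
    n1, n2 = split, len(y) - split
    s_c = ssr(X, y)                            # pooled fit
    s_1 = ssr(X[:split], y[:split])            # first period
    s_2 = ssr(X[split:], y[split:])            # second period
    dof = n1 + n2 - 2 * k
    stat = ((s_c - (s_1 + s_2)) / k) / ((s_1 + s_2) / dof)
    return stat, k, dof


# Synthetic data: one series whose trend changes at t = 50, one that does not.
rng = np.random.default_rng(0)
t = np.arange(100)
noise = rng.normal(scale=1.0, size=100)
y_break = np.where(t < 50, 1.0 + 0.5 * t, 40.0 - 0.2 * t) + noise
y_stable = 1.0 + 0.5 * t + noise

stat_break, k, dof = chow_statistic(t, y_break, split=50)
stat_stable, _, _ = chow_statistic(t, y_stable, split=50)
print(f"with break:    F({k}, {dof}) = {stat_break:.2f}")
print(f"without break: F({k}, {dof}) = {stat_stable:.2f}")
```

The p-value is the upper-tail probability of the F distribution with $k$ and $N_1+N_2-2k$ degrees of freedom at the computed statistic; with SciPy installed, `scipy.stats.f.sf(stat, k, dof)` returns it.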
Below, we will test the storage power_rb and the daily sum of gas used to check for structural breaks. We will set our breakpoint for September 24th, and use David Woroniuk's Python implementation of the Chow test.
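The breakpoint positions passed to the test can be derived from the dates rather than hard-coded. The sketch below assumes a gap-free daily series that starts on the first day of the data pull, and assumes that `last_index` denotes the zero-based position of the final observation in the first period. The variable names are illustrative only.

```python
from datetime import date

series_start = date(2021, 7, 1)   # first day of the data pull
break_date = date(2021, 9, 24)    # date of the regulatory announcement

# Zero-based position of the break date in a gap-free daily series.
last_index = (break_date - series_start).days
first_index = last_index + 1

# If the first day of a series is dropped before the index is reset,
# every position shifts down by one.
last_index_trimmed = last_index - 1
first_index_trimmed = first_index - 1

print(last_index, first_index)                  # full series
print(last_index_trimmed, first_index_trimmed)  # series with first day removed
```

If the series has missing days, the position should instead be looked up from the timestamp column of the data itself.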
import os
os.chdir('..')
from filecoin_digital_twin.retrieve_data import pull_storage_data
import chow_test
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# set start and end dates for data pull
start_date = datetime(2021,7,1)
end_date = datetime(2021,12,20)
truncation_interval = "DAY"
# load connection string
CONN_STRING_PATH = 'config/sentinel-conn-string.txt'
from sqlalchemy import create_engine
with open(CONN_STRING_PATH, 'r') as fid:
    conn_string = fid.read()
# create database connection.
connection = create_engine(conn_string, pool_recycle=3600).connect()
# pull storage data
storage = pull_storage_data(truncation_interval,start_date,end_date)
# reset index from timestamp
storage.reset_index(inplace=True)
QUERY = """
SELECT
date_trunc('{}',
to_timestamp(height_to_unix(d.height))) AS datetime,
SUM(gas_used) as sum_gas_used_daily
FROM derived_gas_outputs d
WHERE
to_timestamp(height_to_unix(height)) BETWEEN '{}' AND '{}'
GROUP BY
datetime
ORDER BY
datetime asc
""".format(truncation_interval,start_date,end_date)
derived_gas_outputs_daily = (pd.read_sql(QUERY, connection))
# remove first and last days
derived_gas_outputs_daily = derived_gas_outputs_daily[1:-1]
# reset index from timestamp
derived_gas_outputs_daily.reset_index(inplace=True)
# storage data
chow_test.chow_test(X_series=storage['index'],
y_series=storage.power_rb,
last_index=85,first_index=86,significance=0.05)
Reject the null hypothesis of equality of regression coefficients in the two periods. Chow Statistic: 769.6132150177385, P_value: 1.1102230246251565e-16
(769.6132150177385, 1.1102230246251565e-16)
storage.plot(x='datetime',y='power_rb',title='Storage power_rb with structural break')
plt.axvline(x='2021-09-24', label='Regulation Shock',c='r')
plt.legend()
[Figure: Storage power_rb with structural break. The red vertical line marks the regulation shock on 2021-09-24.]
chow_test.chow_test(X_series=derived_gas_outputs_daily['index'],
y_series=derived_gas_outputs_daily.sum_gas_used_daily,
last_index=84,first_index=85,significance=0.05)
Reject the null hypothesis of equality of regression coefficients in the two periods. Chow Statistic: 42.88689014843319, P_value: 9.992007221626409e-16
(42.88689014843319, 9.992007221626409e-16)
derived_gas_outputs_daily.plot(x='datetime',y='sum_gas_used_daily',title='Sum of daily gas used with structural break')
plt.axvline(x='2021-09-24', label='Regulation Shock',c='r')
plt.legend()
[Figure: Sum of daily gas used with structural break. The red vertical line marks the regulation shock on 2021-09-24.]
Both Chow tests reject, at the 5% significance level, the null hypothesis of equal regression coefficients before and after September 24th, 2021. This indicates that a structural break occurred in network onboarding following the new Chinese cryptocurrency regulations. The plots above also suggest that the effect becomes more pronounced as time passes since the event, as the impact reverberates through the network.