EDA for Gas Fees - V2

In this notebook we perform Exploratory Data Analysis (EDA) on FIL's gas fee mechanism. The goal is to observe the gas fee as a signal and attempt to understand what may be driving it.

Questions to be answered in this notebook:

Potential steps:

Change log:

Background information: What are Gas Fees?

Note: this description is copied from the official Filecoin documentation

Executing messages, for example by including transactions or proofs in the chain, consumes both computation and storage resources on the network. Gas is a measure of resources consumed by messages. The gas consumed by a message directly affects the cost that the sender has to pay for it to be included in a new block by a miner.

Historically in other blockchains, message senders specify a GasFee in a unit of native currency and then pay the block-producing miners a priority fee based on how much gas is consumed by the message. Filecoin works similarly, except that a portion of the fees is burned (sent to an irrecoverable address) to compensate for the network's expenditure of resources, since all nodes need to validate the messages. The idea is based on Ethereum's EIP-1559.

The amount of fees burned in the Filecoin network is given by a dynamic BaseFee, which is automatically adjusted according to network congestion parameters (block sizes). The current value can be obtained from one of the block explorers or by inspecting the current head.
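To make the congestion-driven adjustment concrete, here is a minimal sketch of an EIP-1559-style update rule. It is not taken from the Filecoin specification: the function name, the `max_change_denom=8` default (the value used by EIP-1559, giving a 12.5% maximum step), and the `min_base_fee=100` floor are assumptions chosen for illustration only.

```python
def next_base_fee(base_fee, gas_per_block, gas_target,
                  max_change_denom=8, min_base_fee=100):
    """EIP-1559-style update (illustrative; all parameters are assumptions).

    base_fee      -- current base fee, in attoFIL per unit of gas
    gas_per_block -- average gas included per block in the last epoch
    gas_target    -- target gas per block
    """
    # How far the last epoch was above (+) or below (-) the target,
    # clamped so a single step can never exceed 1 / max_change_denom.
    delta = gas_per_block - gas_target
    delta = max(-gas_target, min(delta, gas_target))

    # Integer arithmetic; Python floors negative quotients.
    change = base_fee * delta // gas_target // max_change_denom
    return max(base_fee + change, min_base_fee)
```

Under this sketch, blocks that are fuller than the target push the base fee up, emptier blocks push it down, and a block exactly at the target leaves it unchanged.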

Additionally, a number of gas-related parameters are attached to each message and determine the amount of rewards that miners get. Here's an overview of the terms and concepts:

GasUsage: the amount of gas that a message's execution actually consumes. The current protocol does not know exactly how much gas a message will consume ahead of execution, but it can be estimated (see prices). GasUsage is measured in units of Gas.

BaseFee: the amount of FIL that gets burned per unit of gas consumed for the execution of every message. It is measured in units of attoFIL/Gas.

GasLimit: the limit on the amount of gas that a message's execution can consume, estimated and specified by a message sender. It is measured in units of Gas. The sum of GasLimit for all messages included in a block must not exceed the BlockGasLimit. Messages will fail to execute if they run out of Gas, and any effects of the execution will be reverted.

GasFeeCap: the maximum token amount that a sender is willing to pay per GasUnit for including a message in a block. It is measured in units of attoFIL/Gas. A message sender must have a minimum balance of GasFeeCap * GasLimit when sending a message, even though not all of that will be consumed. GasFeeCap can serve as a safeguard against high, unexpected BaseFee fluctuations.

GasPremium: a priority fee that is paid to the block-producing miner. This is capped by GasFeeCap. The BaseFee has a higher priority. It is measured in units of attoFIL/Gas and can be as low as 1 attoFIL/Gas.

Overestimation burn: an additional amount of gas to burn that grows larger when the difference between GasLimit and GasUsage is large.

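The total cost of a message for a sender will be:

GasUsage * BaseFee FIL (burned) +
GasLimit * GasPremium FIL (miner's reward) +
OverEstimationBurn * BaseFee FIL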

An important detail is that a message will always pay the burn fee, regardless of the GasFeeCap used. Thus, a low GasFeeCap may result in a reduced GasPremium or even a negative one! In that case, the miners that include a message will have to pay the needed amounts out of their own pockets, which means they are unlikely to include such messages in new blocks.

Filecoin implementations may choose the heuristics of how their miners select messages for inclusion in new blocks, but they will usually attempt to maximize the miner's rewards.
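The fee components described in this section can be sketched as a small function. This is an illustration assembled from the definitions above and from the column descriptions in the data dictionary below, not the node implementation. The function name, the `miner_shortfall` field, and the choice to apply the capped base fee to the overestimation burn are assumptions.

```python
def message_cost(gas_used, gas_limit, gas_burned,
                 base_fee, gas_fee_cap, gas_premium):
    """Fee breakdown for one message; every amount is in attoFIL."""
    # The sender pays at most gas_fee_cap per unit of gas toward the burn.
    base_fee_paid = min(base_fee, gas_fee_cap)
    base_fee_burn = base_fee_paid * gas_used

    # The premium is capped by what is left under gas_fee_cap.
    effective_premium = min(gas_premium, gas_fee_cap - base_fee_paid)
    miner_tip = effective_premium * gas_limit

    # Assumption: the overestimated gas is charged at the capped base fee.
    # The data dictionary states gas_burned * parent_base_fee; the two agree
    # whenever base_fee <= gas_fee_cap.
    over_estimation_burn = base_fee_paid * gas_burned

    total = base_fee_burn + miner_tip + over_estimation_burn
    return {
        "base_fee_burn": base_fee_burn,
        "miner_tip": miner_tip,
        "over_estimation_burn": over_estimation_burn,
        "total": total,
        # The sender reserves gas_fee_cap * gas_limit up front.
        "refund": gas_fee_cap * gas_limit - total,
        # Hypothetical measure of the burn the sender's cap did not cover.
        "miner_shortfall": (base_fee - base_fee_paid) * gas_used,
    }
```

With `gas_fee_cap` below `base_fee`, the tip drops to zero and `miner_shortfall` becomes positive, which is the situation in which a miner has little reason to include the message.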

Data Resources

Sentinel Diagram

Data EDA

Below we download hourly averages from the messages and message_gas_economy tables from May 1st, 2021 to present (last refreshed 6/28/2021). After downloading the data, we view the first and last 5 rows and compute basic statistics on the data.
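As a rough illustration of the hourly averaging, the sketch below converts epoch heights to UTC timestamps and averages one column per hour. It assumes 30-second epochs and a mainnet genesis timestamp of 1598306400 (2020-08-24 22:00 UTC); both constants, and the helper names, are assumptions of this example rather than values taken from the notebook.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

GENESIS_UNIX = 1598306400  # assumed mainnet genesis (Unix seconds)
EPOCH_SECONDS = 30         # assumed epoch duration


def epoch_to_datetime(height):
    """Convert an epoch height to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(
        GENESIS_UNIX + height * EPOCH_SECONDS, tz=timezone.utc
    )


def hourly_average(rows, value_key="base_fee"):
    """Average rows[value_key] per UTC hour.

    rows is an iterable of dicts with a "height" key, e.g. rows fetched
    from message_gas_economy.
    """
    buckets = defaultdict(list)
    for row in rows:
        hour = epoch_to_datetime(row["height"]).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[hour].append(row[value_key])
    # Return the hours in chronological order.
    return {hour: mean(values) for hour, values in sorted(buckets.items())}
```

The first and last entries of the returned mapping then correspond to the head and tail of the hourly series.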

Data Dictionary - copied from Sentinel's Data Dictionary

derived_gas_outputs

Derived gas costs resulting from execution of a message in the VM.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| actor_name | text | NO | Human readable identifier for the type of the actor. |
| base_fee_burn | text | NO | The amount of FIL (in attoFIL) to burn as a result of the base fee. It is parent_base_fee (or gas_fee_cap if smaller) multiplied by gas_used. Note: successful window PoSt messages are not charged this burn. |
| cid | text | NO | CID of the message. |
| exit_code | bigint | NO | The exit code that was returned as a result of executing the message. Exit code 0 indicates success. Codes 0-15 are reserved for use by the runtime. Codes 16-31 are common codes shared by different actors. Codes 32+ are actor specific. |
| from | text | NO | Address of actor that sent the message. |
| gas_burned | bigint | NO | The overestimated units of gas to burn. It is a portion of the difference between gas_limit and gas_used. |
| gas_fee_cap | text | NO | The maximum price that the message sender is willing to pay per unit of gas. |
| gas_limit | bigint | YES | A hard limit on the amount of gas (i.e., number of units of gas) that a message’s execution should be allowed to consume on chain. It is measured in units of gas. |
| gas_premium | text | NO | The price per unit of gas (measured in attoFIL/gas) that the message sender is willing to pay (on top of the BaseFee) to "tip" the miner that will include this message in a block. |
| gas_refund | bigint | NO | The overestimated units of gas to refund. It is a portion of the difference between gas_limit and gas_used. |
| gas_used | bigint | NO | A measure of the amount of resources (or units of gas) consumed, in order to execute a message. |
| height | bigint | NO | Epoch this message was executed at. |
| method | bigint | YES | The method number to invoke. Only unique to the actor the method is being invoked on. A method number of 0 is a plain token transfer - no method execution. |
| miner_penalty | text | NO | Any penalty fees (in attoFIL) the miner incurred while executing the message. |
| miner_tip | text | NO | The amount of FIL (in attoFIL) the miner receives for executing the message. Typically it is gas_premium * gas_limit but may be lower if the total fees exceed the gas_fee_cap. |
| nonce | bigint | YES | The message nonce, which protects against duplicate messages and multiple messages with the same values. |
| over_estimation_burn | text | NO | The fee to pay (in attoFIL) for overestimating the gas used to execute a message. The overestimated gas to burn (gas_burned) is a portion of the difference between gas_limit and gas_used. The over_estimation_burn value is gas_burned * parent_base_fee. |
| parent_base_fee | text | NO | The set price per unit of gas (measured in attoFIL/gas unit) to be burned (sent to an unrecoverable address) for every message execution. |
| refund | text | NO | The amount of FIL (in attoFIL) to refund to the message sender after base fee, miner tip and overestimation amounts have been deducted. |
| size_bytes | bigint | YES | Size in bytes of the serialized message. |
| state_root | text | NO | CID of the parent state root. |
| to | text | NO | Address of actor that received the message. |
| value | text | NO | The FIL value transferred (attoFIL) to the message receiver. |

message_gas_economy

Gas economics for all messages in all blocks at each epoch.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| base_fee | double precision | NO | The set price per unit of gas (measured in attoFIL/gas unit) to be burned (sent to an unrecoverable address) for every message execution. |
| base_fee_change_log | double precision | NO | The logarithm of the change between new and old base fee. |
| gas_capacity_ratio | double precision | YES | The gas_limit_unique_total / target gas limit total for all blocks. |
| gas_fill_ratio | double precision | YES | The gas_limit_total / target gas limit total for all blocks. |
| gas_limit_total | bigint | NO | The sum of all the gas limits. |
| gas_limit_unique_total | bigint | YES | The sum of all the gas limits of unique messages. |
| gas_waste_ratio | double precision | YES | (gas_limit_total - gas_limit_unique_total) / target gas limit total for all blocks. |
| height | bigint | NO | Epoch these economics apply to. |
| state_root | text | NO | CID of the parent state root at this epoch. |
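The three ratio columns follow directly from their descriptions in the table above. The sketch below recomputes them from the totals. The function name is hypothetical, and `target_gas_limit_total` (the target gas limit summed over all blocks at the epoch) is treated as an input because the table does not say how it is derived.

```python
def gas_ratios(gas_limit_total, gas_limit_unique_total, target_gas_limit_total):
    """Recompute the message_gas_economy ratios for one epoch."""
    duplicated = gas_limit_total - gas_limit_unique_total
    return {
        # Share of the target taken by all included messages.
        "gas_fill_ratio": gas_limit_total / target_gas_limit_total,
        # Share of the target taken by unique messages only.
        "gas_capacity_ratio": gas_limit_unique_total / target_gas_limit_total,
        # Share of the target spent on duplicated messages.
        "gas_waste_ratio": duplicated / target_gas_limit_total,
    }
```

By construction, gas_fill_ratio equals gas_capacity_ratio plus gas_waste_ratio, which gives a quick consistency check on the downloaded data.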

messages

Validated on-chain messages by their CID and their metadata.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| cid | text | NO | CID of the message. |
| from | text | NO | Address of the actor that sent the message. |
| gas_fee_cap | text | NO | The maximum price that the message sender is willing to pay per unit of gas. |
| gas_limit | bigint | NO | - |
| gas_premium | text | NO | The price per unit of gas (measured in attoFIL/gas) that the message sender is willing to pay (on top of the BaseFee) to "tip" the miner that will include this message in a block. |
| height | bigint | NO | Epoch this message was executed at. |
| method | bigint | YES | The method number invoked on the recipient actor. Only unique to the actor the method is being invoked on. A method number of 0 is a plain token transfer - no method execution. |
| nonce | bigint | NO | The message nonce, which protects against duplicate messages and multiple messages with the same values. |
| size_bytes | bigint | NO | Size of the serialized message in bytes. |
| to | text | NO | Address of the actor that received the message. |
| value | text | NO | Amount of FIL (in attoFIL) transferred by this message. |

We will focus on analyzing the methods by actor type using the derived_gas_outputs table. The table is very large, so we will run individual SQL queries to obtain the required data.

For June 1 and June 2, what is the number of transactions by actor type?

Count of all actor types and methods used during June.
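A query of this kind can be sketched as follows. The example runs against a toy in-memory SQLite table so that it is self-contained; the real data lives in the Sentinel database, whose driver may use a different placeholder style (for example `%s` instead of `?`). The genesis timestamp, the epoch length, the toy actor names, and all helper names are assumptions of this sketch.

```python
import sqlite3
from datetime import datetime, timezone

GENESIS_UNIX = 1598306400  # assumed mainnet genesis (Unix seconds)
EPOCH_SECONDS = 30         # assumed epoch duration


def datetime_to_epoch(dt):
    """Convert a timezone-aware datetime to an epoch height."""
    return int((dt.timestamp() - GENESIS_UNIX) // EPOCH_SECONDS)


def count_messages(conn, start, end, by_method=False):
    """Count messages with start <= time < end, grouped by actor type
    (and optionally by method number)."""
    group_cols = "actor_name, method" if by_method else "actor_name"
    sql = (
        f"SELECT {group_cols}, COUNT(*) AS n_messages "
        "FROM derived_gas_outputs "
        "WHERE height >= ? AND height < ? "
        f"GROUP BY {group_cols} "
        f"ORDER BY n_messages DESC, {group_cols}"
    )
    params = (datetime_to_epoch(start), datetime_to_epoch(end))
    return conn.execute(sql, params).fetchall()


def make_toy_db():
    """Build a tiny stand-in table with made-up rows (not real chain data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE derived_gas_outputs "
        "(actor_name TEXT, method INTEGER, height INTEGER)"
    )
    conn.executemany(
        "INSERT INTO derived_gas_outputs VALUES (?, ?, ?)",
        [("actor_a", 5, 806640), ("actor_a", 6, 806641),
         ("actor_b", 0, 806642), ("actor_b", 0, 900000)],
    )
    return conn
```

Calling `count_messages(conn, june_1, june_3)` gives the per-actor counts for June 1 and June 2, and passing `by_method=True` with a month-long range gives the actor and method breakdown.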

Time Analysis

When examining the data, we developed the following two questions:

  1. Is the distribution of intervals between data points consistent?
  2. What type of sampling time do we have?

Workflow:

1. Verify sampling intervals
2. Verify the distribution of the intervals (normal, Poisson, or otherwise)

To answer these questions, we will calculate the differences between consecutive timestamps, plot a histogram of those differences, and determine whether the data is sampled at equal time intervals.
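The interval check described above can be sketched with the standard library alone. The helper names are hypothetical, and the input is assumed to be a sequence of `datetime` objects, one per data point.

```python
from collections import Counter
from datetime import datetime, timedelta


def interval_histogram(timestamps):
    """Count how often each gap between consecutive timestamps occurs."""
    ordered = sorted(timestamps)
    deltas = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(deltas)


def is_evenly_sampled(timestamps):
    """True when every consecutive gap has the same length."""
    return len(interval_histogram(timestamps)) == 1
```

A single key in the histogram means equal time sampling. Several keys point to missing hours or irregular sampling, and their counts show how the gaps are distributed.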

Analyze message metadata

Required Mapping to unblock Message Analysis

The data is keyed by method ID crossed with actor ID, and we don't know what these combinations represent. The representation is overly flat: to interpret it, we need to expand it into two separate keys, one for the method and one for the actor.

Below is currently all we have to go on:

https://github.com/filecoin-project/specs-actors/tree/master/actors/builtin

https://github.com/filecoin-project/specs-actors/blob/master/actors/builtin/multisig/multisig_actor.go#L52-L64

If this lookup table is correct, the vast majority of transactions use methods 6, 7, and 5, which are RemoveSigner, SwapSigner, and AddSigner, respectively.
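A lookup keyed on both the actor and the method could look like the sketch below. Only the three multisig method numbers named above and the method-0 convention from the data dictionary are filled in. The `"multisig"` key is a hypothetical actor label, and the actual actor_name values in derived_gas_outputs may be spelled differently.

```python
# (actor, method number) -> method name.
# Entries are limited to what the text above states; extend as mappings
# for other actors are confirmed.
METHOD_NAMES = {
    ("multisig", 5): "AddSigner",
    ("multisig", 6): "RemoveSigner",
    ("multisig", 7): "SwapSigner",
}


def method_label(actor, method):
    """Resolve a method number in the context of its actor type."""
    if method == 0:
        # Per the data dictionary, method 0 is a plain token transfer.
        return "Send (plain token transfer)"
    return METHOD_NAMES.get((actor, method), f"unknown({actor}, {method})")
```

Because method numbers are only unique within an actor, the same number under a different actor falls through to the unknown label instead of silently reusing the multisig name.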

We will continue to focus on the derived_gas_outputs table with SQL queries to understand why transactions occur, since actor information is not easily accessible from the messages table.

Analyze message gas economy data

Exogenous signals

https://github.com/filecoin-project/sentinel/blob/master/docs/db.md

chain_economics

Economic summaries per state root CID.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| burnt_fil | text | NO | Total FIL (attoFIL) burned as part of penalties and on-chain computations. |
| circulating_fil | text | NO | The amount of FIL (attoFIL) circulating and tradeable in the economy. The basis for Market Cap calculations. |
| locked_fil | text | NO | The amount of FIL (attoFIL) locked as part of mining, deals, and other mechanisms. |
| mined_fil | text | NO | The amount of FIL (attoFIL) that has been mined by storage miners. |
| parent_state_root | text | NO | CID of the parent state root. |
| vested_fil | text | NO | Total amount of FIL (attoFIL) that is vested from genesis allocation. |

chain_powers

Power summaries from the Power actor.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| height | bigint | NO | Epoch this power summary applies to. |
| miner_count | bigint | YES | Total number of miners. |
| participating_miner_count | bigint | YES | Total number of miners with power above the minimum miner threshold. |
| qa_smoothed_position_estimate | text | NO | Total power smoothed position estimate - Alpha Beta Filter "position" (value) estimate in Q.128 format. |
| qa_smoothed_velocity_estimate | text | NO | Total power smoothed velocity estimate - Alpha Beta Filter "velocity" (rate of change of value) estimate in Q.128 format. |
| state_root | text | NO | CID of the parent state root. |
| total_pledge_collateral | text | NO | Total locked FIL (attoFIL) miners have pledged as collateral in order to participate in the economy. |
| total_qa_bytes_committed | text | NO | Total provably committed, quality adjusted storage power in bytes. Quality adjusted power is a weighted average of the quality of its space and it is based on the size, duration and quality of its deals. |
| total_qa_bytes_power | text | NO | Total quality adjusted storage power in bytes in the network. Quality adjusted power is a weighted average of the quality of its space and it is based on the size, duration and quality of its deals. |
| total_raw_bytes_committed | text | NO | Total provably committed storage power in bytes. Raw byte power is the size of a sector in bytes. |
| total_raw_bytes_power | text | NO | Total storage power in bytes in the network. Raw byte power is the size of a sector in bytes. |

chain_rewards

Reward summaries from the Reward actor.

| Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| cum_sum_baseline | text | NO | Target that CumsumRealized needs to reach for EffectiveNetworkTime to increase. It is measured in byte-epochs (space * time) representing power committed to the network for some duration. |
| cum_sum_realized | text | NO | Cumulative sum of network power capped by BaselinePower(epoch). It is measured in byte-epochs (space * time) representing power committed to the network for some duration. |
| effective_baseline_power | text | NO | The baseline power (in bytes) at the EffectiveNetworkTime epoch. |
| effective_network_time | bigint | YES | Ceiling of real effective network time "theta" based on CumsumBaselinePower(theta) == CumsumRealizedPower. Theta captures the notion of how much the network has progressed in its baseline and in advancing network time. |
| height | bigint | NO | Epoch this rewards summary applies to. |
| new_baseline_power | text | NO | The baseline power (in bytes) the network is targeting. |
| new_reward | text | YES | The reward to be paid per WinCount to block producers. The actual reward total paid out depends on the number of winners in any round. This value is recomputed every non-null epoch and used in the next non-null epoch. |
| new_reward_smoothed_position_estimate | text | NO | Smoothed reward position estimate - Alpha Beta Filter "position" (value) estimate in Q.128 format. |
| new_reward_smoothed_velocity_estimate | text | NO | Smoothed reward velocity estimate - Alpha Beta Filter "velocity" (rate of change of value) estimate in Q.128 format. |
| state_root | text | NO | CID of the parent state root. |
| total_mined_reward | text | NO | The total FIL (attoFIL) awarded to block miners. |

Signal Analysis

VAR Analysis

As a final check prior to modeling, we will run the Augmented Dickey-Fuller test to ensure that our data is stationary, i.e., that it has no unit root (a unit root is a stochastic trend in a time series). The test's hypotheses are: