32
Streaming Ethereum Blocks Into bSQL Using Infura and Python
Blockchain data is secure and tamper proof, but when working with blockchain data in a more traditional environment, let's say a conventional database, it becomes harder to extend these guarantees. Can data be trusted after it has left the blockchain?
By using a less traditional form of DBMS, an immutable database, we can:
- Verify that data hasn't been illicitly changed
- Track all changes made to the system
- Easily access old versions of the system
In this tutorial I will be using Blockpoint's bSQL, because it's immutable, relational, structured, and has a rich language. The bSQL storage structure is actually very similar to that of a blockchain in that data pages are hashed and linked together. Data is added, never deleted:
We will be using an Infura free trial to access the Ethereum network, Python to filter for new blocks, and bSQL to store blockchain data. In order to do so we will:
- Sign up for a free Infura account and obtain an Ethereum endpoint.
- Create a bSQL account and deploy a free instance.
- Write a python script and start streaming Ethereum blocks to the database.
At any time you can reference the public repo
Register for a free Infura account here. This will give you access to 100,000 Requests/Day, which is plenty given that this is just a demonstration.
Once you've set up your Infura account, you can access your project ID, it will be needed to connect to Infura using an endpoint and can be found under your project name.
The endpoint for accessing the data will resemble the following:
https://mainnet.infura.io/v3/your_project_id
We will use this endpoint when we set up our python application.
The next step is to set up our bSQL instance by:
- Deploying a database using the Blockpoint Portal
- Opening the instance in the IDE
- Creating a database and a blockchain
In order to create a bSQL account you will need an a unique access token, you can get your access token by messaging me directly or joining the slack, it's free!
1.) The tutorial for creating an account and deploying your first instance can be found here. Once completed, a new instance should appear on the blockpoint portal home page.
2.) Once created, navigate to the home page. To open the IDE, click "Open in IDE" and, when prompted, provide your database credentials.
3.) Finally we are going to run a few bSQL commands to finish our set up.
a. Create a new database called "eth" by running CREATE
DATABASE eth;
b. Interact with the newly created database by running USE
eth;
Next, we are going to want to configure a single blockchain for capturing Ethereum data. A blockchain is a structured container for storing data in bSQL. Once data has been added to the system, it cannot be removed. For a more comprehensive overview on the blockchain structure read the documentation here.
For the sake of keeping this tutorial simple, we are going to use a single blockchain called blocks to track new blocks added to the Ethereum network. Using a historical blockchain we can enforce immutability and check data integrity. Deploy the blockchain by running the following command in the IDE.
CREATE BLOCKCHAIN blocks HISTORICAL (
time TIMESTAMP,
number UINT64,
hash STRING SIZE=66,
parent_hash STRING SIZE=66,
nonce STRING SIZE=42,
sha3_uncles STRING SIZE=66,
logs_bloom STRING PACKED SIZE=18000,
transactions_root STRING SIZE=66,
state_root STRING SIZE=66,
receipts_root STRING SIZE=66,
miner STRING SIZE=42,
difficulty FLOAT64,
size_of_block INT64,
extra_data STRING PACKED,
gas_limit INT64,
transaction_count INT64,
base_fee_per_gas INT64
);
Congrats on building your first bSQL blockchain! Now let's start adding data.
In order to set up your python applet you will need the following:
- Python downloaded and installed on your computer, I'm using python 3.9.
- A python IDE, I'm using Pycharm
- web3 for python
- The bSQL python bSQL database driver
Once the follow criteria are met, you can set up a main.py file in your project directory. The code is on Github although it will not work until the above criteria is met and you have deployed an instance.
The first step in our code is to define our connections. You will need to fill out the following fields:
- your Infura project id
- your bSQL username, password and public IP address
infru = "https://mainnet.infura.io/v3/your_project_id" #change me
web3 = Web3(Web3.HTTPProvider(infru))
conn = mdb_bp.driver.connect(
username="your username", #change me
password="your password", #change me
connection_protocol="tcp",
server_address="server address", #change me
server_port=5461,
database_name="eth",
parameters={"interpolateParams": True},
)
The main method defines a filter for the latest Ethereum block and passes it into our loop that we will define next.
def main():
block_filter = web3.eth.filter('latest')
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(
asyncio.gather(
log_loop(block_filter, 2)))
finally:
# close loop to free up system resources
loop.close()
if __name__ == '__main__':
main()
Our loop sleeps for a desired interval, then attempts to pull new entries from the event filter. Every time a new entry is received the event is handled in the handle_event
function.
async def log_loop(event_filter, poll_interval):
while True:
for PairCreated in event_filter.get_new_entries():
handle_event(web3.eth.get_block(PairCreated))
await asyncio.sleep(poll_interval)
Time for the database call. Every time a new block is added to the chain, we print to the console and send an insertion statement to the database, inserting block data into blocks.
def handle_event(block):
print(block['number'])
conn.exec("INSERT blocks VALUES (" +
"\"" + str(datetime.datetime.utcfromtimestamp(block['timestamp'])) + "\"," +
str(block['number']) + "," +
"\"" + str(block['hash'].hex()) + "\"," +
"\"" + str(block['parentHash'].hex()) + "\"," +
"\"" + str(block['nonce'].hex()) + "\"," +
"\"" + str(block['sha3Uncles'].hex()) + "\"," +
"\"" + str(block['logsBloom'].hex()) + "\"," +
"\"" + str(block['transactionsRoot'].hex()) + "\"," +
"\"" + str(block['stateRoot'].hex()) + "\"," +
"\"" + str(block['receiptsRoot'].hex()) + "\"," +
"\"" + str(block['miner']) + "\"," +
str(block['difficulty']) + "," +
str(block['size']) + "," +
"\"" + str(block['extraData'].hex()) + "\"," +
str(block['gasLimit']) + "," +
str(len(block['transactions'])) + "," +
str(block['baseFeePerGas']) + ")")
And that's all the code needed, so give that baby a run.
After letting my program run for about an hour, I stopped my script and started to do a little data exploration.
I ran a few queries in the bSQL portal and included them in the repo. You can load this file into the bSQL IDE or write your own queries.
There you have it. A fun little script for Ethereum data. There is definitely more to explore when it comes to how the data is pulled and even more queries to write.
Like always, please comment your feedback or any questions you may have.
32