June 9th, 2022

Web3 startup teams have increasingly complex needs for blockchain data and are no longer satisfied with just building dashboards. They may want to build machine learning models for prediction or recommendation scenarios, run graph analysis algorithms to find abnormal transaction addresses, or provide metrics data to upper-layer business applications. Each of these use cases requires a blockchain warehouse built on on-chain data.

But building a stable, easy-to-use blockchain warehouse is no easy task. ethereum-etl and ethereum-etl-airflow are two excellent open-source projects: users can export the full historical data set directly with ethereum-etl and build a T+1 pipeline (each day's data is loaded on the following day) with ethereum-etl-airflow. However, these two tools alone are only half of the story.
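As a rough illustration of what a single T+1 run involves, the sketch below computes the UTC time window of the previous day and assembles an ethereum-etl export command for a block range. The helper names `t_plus_1_window` and `export_command`, the node URL, the output directory, and the block numbers are all assumptions made for this example; mapping the time window to actual block numbers is left out, and the command is only built, not executed.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import List, Tuple


def t_plus_1_window(run_date: date) -> Tuple[int, int]:
    """Return (start, end) Unix timestamps covering the UTC day before run_date."""
    target_day = run_date - timedelta(days=1)
    start = datetime.combine(target_day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return int(start.timestamp()), int(end.timestamp())


def export_command(start_block: int, end_block: int,
                   provider_uri: str, out_dir: str) -> List[str]:
    """Build (but do not run) an ethereumetl export command for a block range."""
    return [
        "ethereumetl", "export_blocks_and_transactions",
        "--start-block", str(start_block),
        "--end-block", str(end_block),
        "--provider-uri", provider_uri,
        "--blocks-output", f"{out_dir}/blocks.csv",
        "--transactions-output", f"{out_dir}/transactions.csv",
    ]


if __name__ == "__main__":
    # A job running on June 9th processes the data of June 8th (T+1).
    start_ts, end_ts = t_plus_1_window(date(2022, 6, 9))
    print(start_ts, end_ts)
    # Placeholder block range and node URL, for illustration only.
    print(" ".join(export_command(0, 99, "http://localhost:8545", "/tmp/eth")))
```

In a scheduled setup, a daily job would derive the window from its run date, resolve it to a block range, and then run the export; ethereum-etl-airflow packages this kind of daily workflow as Airflow DAGs.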

Building a blockchain warehouse with open-source components

This is a complete data processing pipeline: