[NOTE2021.12.30]Privacy audit tool

We've been working at @ClassLambda on an anonymity tool for @TornadoCash.

Tornado Cash is a mixer that allows users to deposit Ethereum and then withdraw it with another account, breaking the on-chain link between them. Below, a thread with details of what we are doing:

  • For the @0xPARC applied ZK course, we had to do a lightning talk. For my talk I chose to ELI5 @TornadoCash. It's one of the simplest useful applications of zkSNARKs and it let me understand how SNARKs work in a real app. A thread going into detail on how it works under the hood:

    Tornado cash is a mixer that allows you to take some ETH and hide the link between the account you deposit in and the account you withdraw in. Tornado cash is a smart contract that holds the funds as well as a merkle tree of all participants.

    A merkle tree is a tree-like data structure that has data on the leaves and creates hashes all the way up until you get a root hash. It is an efficient data structure to prove inclusion in a set of participants without having to provide every single piece of data.

    When you deposit you create a secret offline and then hash that into a commitment. It is this commitment that is added to the merkle tree on-chain. When you withdraw you want to prove you are within the merkle tree so the contract allows you to withdraw your funds.

    To prove inclusion in a merkle tree, you normally have to provide the data itself, the sibling hash, plus all the sibling hashes going up until you get to the root hash. However, if you provided this, you would reveal which leaf you are and when that deposit was made.
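The plain (non-private) inclusion proof described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the circuit-friendly hash a real deployment would use, the number of leaves is assumed to be a power of two, and the helper names (`build_levels`, `merkle_proof`, `verify`) are hypothetical, not taken from tornado-core.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in hash (SHA-256); an assumption made for readability."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from the hashed leaves up to the root.

    Assumes len(leaves) is a power of two.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hash at each level on the way up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from the leaf to the root using the sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"a", b"b", b"c", b"d"]
levels = build_levels(leaves)
root = levels[-1][0]
proof = merkle_proof(levels, 2)  # prove that b"c" is in the tree
print(verify(b"c", proof, root))
```

Note that the verifier sees the leaf and its position, which is exactly the leak the thread goes on to avoid.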

    So on withdrawal instead of providing the actual data, you only provide a zk proof of inclusion. This proof is created using a secret you generated on deposit.

    However if you create a proof of inclusion using a zkSNARK, there is one issue. How does the contract know if you have withdrawn before? It doesn’t actually know who you are so it does not know if you have withdrawn in the past.

    This means there is a double spend problem that needs to be solved as a proof could be used multiple times to drain the contract of funds. The proof proves you are in the tree, but there is no way to know if the proof has been used before.

    This is solved by adding a salt to the commitment in addition to your secret. When you create your commitment on deposit, you are hashing both the secret and the salt together. They call this a nullifier in Tornado cash, which is a unique identifier of your commitment.

    When you generate your proof, this is also provided with your secret. And when you withdraw, your nullifier is also provided and is recorded in the smart contract so the proof cannot be used again.
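A toy model of the deposit and withdrawal bookkeeping may help here. It is a sketch under assumptions, not the contract's code: SHA-256 replaces the hashes used in the real circuits, the zkSNARK check is reduced to a boolean flag, the contract is assumed to record a hash of the nullifier, and `ToyPool`, `new_note`, and `commitment` are hypothetical names.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    """Stand-in hash (SHA-256); an assumption, not the hash used on-chain."""
    return hashlib.sha256(data).digest()


def new_note():
    """Generate the two random values the depositor keeps offline."""
    nullifier = secrets.token_bytes(31)
    secret = secrets.token_bytes(31)
    return nullifier, secret


def commitment(nullifier: bytes, secret: bytes) -> bytes:
    """Hash the nullifier and the secret together, as the thread describes."""
    return h(nullifier + secret)


class ToyPool:
    """Keeps the public commitments and the set of spent nullifier hashes."""

    def __init__(self):
        self.commitments = []
        self.spent = set()

    def deposit(self, commitment_hash: bytes) -> None:
        self.commitments.append(commitment_hash)

    def withdraw(self, nullifier_hash: bytes, proof_ok: bool) -> None:
        # proof_ok stands in for verifying the zkSNARK on-chain.
        if not proof_ok:
            raise ValueError("invalid proof")
        if nullifier_hash in self.spent:
            raise ValueError("note already spent")
        self.spent.add(nullifier_hash)


pool = ToyPool()
nullifier, secret = new_note()
pool.deposit(commitment(nullifier, secret))
pool.withdraw(h(nullifier), proof_ok=True)  # first withdrawal succeeds
```

A second call to `withdraw` with the same nullifier hash raises `ValueError`, which is the double-spend protection. The pool never learns which commitment the nullifier belongs to.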

    The last piece of the puzzle is relayers. Relayers allow the actual withdrawal to happen without any ETH in the new address. Additionally, this prevents any link from being created between the deposit address and the withdrawal address (other than that they both used Tornado).

    Relayers take the proof and submit it to the Tornado cash smart contract on your behalf. They cannot steal the ETH as the proof has been created with the recipient as an input. If the recipient is changed, this will invalidate the proof itself and the withdrawal will fail.

    Relayers are pivotal to keeping anonymity in Tornado Cash. They stop any link from connecting the deposit and withdrawal address. If you use the same address (or an address that links to it) you break all the anonymity the zk proof has helped you provide.

    In summary Tornado cash is a proof of inclusion in a merkle tree, where snarks are used to do the proof without revealing the leaf node you are and with double spend protection to stop the proof being used twice.

    For this tweet thread I mainly read the code at: https://github.com/tornadocash/tornado-core

    I would encourage anyone getting into zkSNARKs to take a look as the circuits are much more approachable than others I've seen.

    Tornado also uses @ensdomains for additional censorship resistance for all their relayers, which is pretty cool too! But it's not integral to Tornado cash working, so I'll just add it in here at the end!

    Original article from Twitter

    https://twitter.com/_jefflau/status/1468065457190350850

Currently, TCash lets you deposit a fixed amount of ETH in their pools. With that deposit you receive a note. Later, using that note you can withdraw your deposit to any wallet of your choice.

The idea is that, as many people deposited the same amount of currency in the same pool, each withdrawal could have come from any of these people. This concept is called the Anonymity Set.

[Figure: Anonymity Set]

As the number of deposits grows, the privacy of the pool grows too since each deposit is mixed with the rest. Unfortunately, there are many ways in which users can carelessly use Tornado Cash and compromise their privacy.

This is not only bad for the user who misused the application, but also diminishes the privacy of all depositors. Why is that?

Because if one can match a withdrawal with its deposit without uncertainty, one can infer that the rest of the withdrawals are not linked to that deposit! In other words, that deposit will no longer be a part of the Anonymity Set.

This is why the TCash community launched a bounty in which we participated with a team from Stanford to help its users. We're creating a tool to let users know the true size of the anonymity set of each pool and point out if they are making mistakes that de-anonymize them.

We developed a series of heuristics that allow us to link deposits and withdrawals. Below is an explanation of each one. We're working on more at the moment, and we're migrating them to Julia and Pluto. Also you can see the implementation here!

Heuristic 1: reuse of the deposit address for withdrawal.

This is the most trivial heuristic: if a user deposits from a wallet and uses the same wallet to withdraw, their anonymity is completely nullified.
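A minimal sketch of this check, assuming deposits and withdrawals have already been extracted into simple records. The `Tx` record and its field names are hypothetical; for a withdrawal, `address` is taken to be the recipient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    tx_hash: str
    address: str  # depositor for deposits, recipient for withdrawals


def same_address_links(deposits, withdrawals):
    """Pair every withdrawal with the deposits made from the same address."""
    by_address = {}
    for d in deposits:
        # Ethereum addresses are case-insensitive apart from the checksum.
        by_address.setdefault(d.address.lower(), []).append(d)

    links = []
    for w in withdrawals:
        for d in by_address.get(w.address.lower(), []):
            links.append((d.tx_hash, w.tx_hash))
    return links


deposits = [Tx("d1", "0xAbc"), Tx("d2", "0xdef")]
withdrawals = [Tx("w1", "0xabc"), Tx("w2", "0x123")]
print(same_address_links(deposits, withdrawals))
```

The sketch ignores ordering; a fuller version would probably also require the deposit to precede the withdrawal.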

Heuristic 2: Usage of a unique gas price

Many of the wallets have gas price recommendation systems. However, if the user manually sets the amount of gas to pay, that amount will remain the default price that the wallet will use for other transactions.

So what happens if you find a deposit that has a unique gas price among all deposits and then you find a withdrawal that has that specific, unique gas price? That's right! Those two transactions can be linked. That is the logic of our second heuristic.
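The matching logic could look roughly like the sketch below. The input shape, a list of `(tx_hash, gas_price_wei)` pairs, is an assumption, and so is the stricter rule that the price must also be unique among withdrawals. This is not the team's implementation.

```python
from collections import Counter


def unique_gas_price_links(deposits, withdrawals):
    """Link a deposit and a withdrawal that share a one-of-a-kind gas price.

    deposits / withdrawals: lists of (tx_hash, gas_price_wei).
    """
    deposit_counts = Counter(price for _, price in deposits)
    withdrawal_counts = Counter(price for _, price in withdrawals)

    # Keep only deposits whose gas price appears exactly once among deposits.
    deposit_by_price = {
        price: tx for tx, price in deposits if deposit_counts[price] == 1
    }

    return [
        (deposit_by_price[price], tx)
        for tx, price in withdrawals
        if price in deposit_by_price and withdrawal_counts[price] == 1
    ]


deposits = [("d1", 1_000_000_007), ("d2", 2_000_000_000), ("d3", 2_000_000_000)]
withdrawals = [("w1", 2_000_000_000), ("w2", 1_000_000_007)]
print(unique_gas_price_links(deposits, withdrawals))
```

Here only `d1` and `w2` are linked, because 2 gwei is shared by two deposits and so identifies nobody.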

Heuristic 3: Transactions outside Tornado Cash

The third heuristic is concerned with relating all deposit and withdrawal addresses that have transactions between them. Simply put, it can be assumed that wallets that interact with each other belong to the same entity.

The idea of this heuristic is simple, but its implementation is not, due to the large amount of computation required. It is necessary to traverse the entire Ethereum transaction graph, looking for transactions between addresses.
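On a small, already-filtered list of transfers the core check is short; the expensive part mentioned above is producing that list from the full transaction graph. A sketch, assuming transfers are given as `(from_address, to_address)` pairs and using a hypothetical function name:

```python
def linked_by_direct_transfer(deposit_addrs, withdraw_addrs, transfers):
    """Return (deposit_address, withdrawal_address) pairs that transacted directly.

    transfers: iterable of (from_address, to_address) outside Tornado Cash.
    """
    deposit_addrs = set(deposit_addrs)
    withdraw_addrs = set(withdraw_addrs)

    links = set()
    for sender, receiver in transfers:
        # The direction of the transfer does not matter for linking.
        if sender in deposit_addrs and receiver in withdraw_addrs:
            links.add((sender, receiver))
        if sender in withdraw_addrs and receiver in deposit_addrs:
            links.add((receiver, sender))
    return links


transfers = [("A", "X"), ("Y", "B"), ("A", "C"), ("Q", "X")]
print(linked_by_direct_transfer({"A", "B"}, {"X", "Y"}, transfers))
```

Only direct, one-hop transfers are considered here; following longer paths is where the computational cost grows.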

Heuristic 4: Multi-Denomination Reveal

Let's suppose that your source wallet mixes a specific set of denominations and your destination wallet withdraws them all (example: you mix 3x 10 ETH, 2x 1 ETH, 1x 0.1 ETH in order to get 32.1 ETH).

In this case the anonymity set will be reduced to only those depositing addresses that have made the same mix. And it would be completely annulled if no other wallet has mixed this exact denomination set. In this case all involved transactions would be linked.
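One way to express this is to compare, per address, the multiset of denominations used. The sketch below assumes events are `(address, denomination)` pairs with denominations kept as strings to avoid floating-point issues; the function names and the `min_pools` threshold are hypothetical.

```python
from collections import Counter


def denomination_profile(events):
    """Map each address to a Counter of the denominations it used."""
    profiles = {}
    for address, denomination in events:
        profiles.setdefault(address, Counter())[denomination] += 1
    return profiles


def multi_denomination_links(deposits, withdrawals, min_pools=2):
    """For each withdrawal address, list deposit addresses with the same mix.

    Addresses that touched fewer than min_pools denominations are skipped,
    since a single-pool profile says little.
    """
    deposit_profiles = denomination_profile(deposits)
    withdrawal_profiles = denomination_profile(withdrawals)

    links = {}
    for w_addr, w_profile in withdrawal_profiles.items():
        if len(w_profile) < min_pools:
            continue
        links[w_addr] = [
            d_addr for d_addr, d_profile in deposit_profiles.items()
            if d_profile == w_profile
        ]
    return links


# The example from the text: 3x 10 ETH, 2x 1 ETH, 1x 0.1 ETH.
mix = ["10"] * 3 + ["1"] * 2 + ["0.1"]
deposits = [("A", d) for d in mix] + [("B", "10"), ("B", "1")]
withdrawals = [("Z", d) for d in mix] + [("Y", "1")]
print(multi_denomination_links(deposits, withdrawals))
```

If the returned list for a withdrawal address has exactly one entry, the anonymity set for those withdrawals has collapsed to a single depositor.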

Heuristic 5: Careless usage of anonymity mining

Anonymity mining is an incentive to increase the level of privacy (number of deposits) by rewarding participants with anonymity points (AP) depending on how long they leave their assets in a pool.

Tornado cash rewards users with a fixed amount of anonymity points for each block spent with assets deposited into any pool. After having withdrawn their assets, users can claim their Anonymity Points. The amount withdrawn is recorded in the transaction.

We are still working on Heuristic 5 but we already linked 385 addresses!

So if a user uses the same withdrawal or deposit address to claim anonymity points, you can calculate the exact number of blocks the assets were in the pool! Then you just calculate which block the deposit or withdrawal should be in!
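The arithmetic behind this is a single division. The sketch below assumes a fixed per-block reward for the pool, as described above; the rate of 20 points per block and the block numbers are made-up values for illustration, and the function names are hypothetical.

```python
def blocks_in_pool(claimed_points, points_per_block):
    """Recover how many blocks the deposit stayed in the pool."""
    blocks, remainder = divmod(claimed_points, points_per_block)
    if remainder:
        raise ValueError("claim is not a whole number of blocks for this pool")
    return blocks


def expected_deposit_block(withdrawal_block, claimed_points, points_per_block):
    """Given a known withdrawal block, locate the block of the matching deposit."""
    return withdrawal_block - blocks_in_pool(claimed_points, points_per_block)


# Hypothetical numbers: 40,000 AP claimed at 20 AP per block.
print(expected_deposit_block(13_000_000, 40_000, 20))
```

The same subtraction run the other way (deposit block plus blocks in the pool) locates the withdrawal when the claiming address is the depositor.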

The heuristics were designed and implemented by the @ClassLambda team (@herman_obst, Mariano Nicolini, Pedro Fontana), @Istvan_A_Seres from @eotvos_uni, and Manuel Puebla from @UBAonline.

The information provided by these heuristics is used by an app that was created by Kaili Wang (https://github.com/kkailiwang), Mike Wu (https://github.com/mhw32), Will McTighe (https://github.com/Tiggy560) from @Stanford and @bax1337 (https://github.com/nickbax) from @convexlabs.

They had to modify a graph clustering algorithm based on this paper https://fc20.ifca.ai/preproceedings/31.pdf… to cluster ETH addresses. They also added Diff2Vec https://arxiv.org/pdf/2001.07463.pdf… to always be able to find clusters in Ethereum addresses.

The application is still being developed, but you can check it out at https://tutela.xyz. Tutela will show related addresses and an anonymity score.

It has been an interesting multidisciplinary effort between people from different continents. It isn't easy to work with people you don't know, but that isn't the case when there is a common goal. This work of the @TornadoCash community is another example of the power of DAOs.

We are writing a post explaining this for our blog https://notamonadtutorial.com. In the coming weeks we will write a chapter about this experience in our book https://datasciencejuliahackers.com. If you liked this thread please follow @federicocarrone or send me a PM if you want to hire us.

Original article from Twitter
