TrueBit – White Paper

Highlight to Annotate. Click to Learn.

 

A scalable verification solution for blockchains

Jason Teutsch TrueBit Establishment jt@truebit.io

Christian Reitwießner Ethereum Foundation chris@ethereum.org

November 16, 2017

 

 

Abstract

Bitcoin and Ethereum, whose miners arguably collectively comprise the most powerful computational resource in the history of mankind, offer no more power for processing and verifying transactions than a typical smart phone. The system described herein bypasses this bottle- neck and brings scalable computation to Ethereum. Our new system consists of a financial incentive layer atop a dispute resolution layer where the latter takes form of a versatile “verification game.” In addi- tion to secure outsourced computation, immediate applications include decentralized mining pools whose operator is an Ethereum smart con- tract, a cryptocurrency with scalable transaction throughput, and a trustless means for transferring currency between disjoint cryptocur- rency systems.

Contents

Securing computations with economics2

  1. How TrueBit works5
    1. System properties 7
    2. Assumptions 8
    3. Attacker model 9
  2. Dispute resolution layer10
    1. Bottleneck: The Verifier’s Dilemma 11
    2. Solution: The verification game 12
    3. Detailed protocol 13
    4. Runtime and security analysis 16

Incentive layer17

Defenses26

Implementation36

Applications37

Addendum40

Securing computations with economics

Every Nakamoto consensus-based cryptocurrency (e.g. Bitcoin or Ethereum) offers something that everyone in the world agrees on: a definitive, public ledger of financial transactions, or blockchain. This consensus technology en- ables basic Bitcoin transactions, which transfer currency from one party to

another, while Ethereum transactions perform financial and database oper- ations contingent on the evaluation of more complex computational scripts. Anonymous miners, who freely join and leave cryptocurrency networks, de- termine the validity of these transactions and thereby establish who owns which coins. Remarkably, this verification scheme requires no central au- thority to operate securely.

In practice, miners can successfully maintain blockchain integrity so long as the computational burden of verification remains minimal. Indeed, Bit- coin and Ethereum, whose miners arguably collectively comprise the most powerful computational resource in the history of mankind, offer no more power for processing and verifying transactions than a typical smart phone. One cannot simply increase the volume or complexity of transactions flow- ing into the blockchain without risking inclusion of invalid transactions due to the so-called Verifier’s Dilemma [50] (see Section3.1). In 2015, the Ver- ifier’s Dilemma manifested itself in the form of the July 4 Bitcoin fork [35] which temporarily derailed the entire network. The Ethereum community also witnessed a related exploit in its 2016 denial-of-service attacks [16].

A consensus computer [50] permits users to outsource computations to the Ethereum network and receive correct answers in exchange for payment so long as the effort required to verify solutions does not exceed the threshold induced by the Verifier’s Dilemma. Thus the trustless consensus computer offers a small but reliable kernel of semantic truth.

Our contribution. We present here a system, called TrueBit, which am- plifies the consensus computer’s capabilities. TrueBit enables trustless smart contracts, in theory, to securely perform any computation task. Moreover, TrueBit vastly reduces the number of redundant network node computations used in traditional Ethereum smart contracts. Presently every Ethereum miner has to independently replicate each smart contract action in its en- tirety, whereas TrueBit outsources most computation work to a handful of entities. In this way, TrueBit makes secure computing affordable.

Outsourced computation

Let us ponder for a moment why cryptocurrencies might offer an especially convenient framework for secure outsourced computation, the core appli- cation of TrueBit. In the traditional cloud models, users must trust that the machine hardware, software, and cloud administrator all perform as ex- pected. A wide range of things can go wrong, particularly when one wishes to tie the results of such computations to monetized entities such as smart

contracts. Proper economic incentives, the cornerstone of any cryptocur- rency, can deter many types of errors from occurring in ways that simple task repetition cannot. Furthermore, in contrast to a cloud whose configu- ration details may not be visible to users, any systemic network errors that might occur on a blockchain-based system like TrueBit would appear in plain sight to the entire community. Cryptocurrencies indeed provide an excellent starting point as they already achieve several desirable properties.

      1. As witnessed by the Ethereum consensus computer, Nakamoto con- sensus grants networks the ability to trustlessly perform small compu- tations correctly.
      2. Blockchain public ledgers provide perfect transparency and immutabil- ity. Ethereum smart contracts inherit these characteristics.
      3. Currency incentives, which in cryptocurrencies are inextricably tied to computational processes, can be used to recruit and reward partic- ipants.

In general, economic forces greatly reduce the scope of possible network errors. We can safely assume that errors which incur economic penalties are much less likely to occur than those which do not.

Practical impact

A market for work related to computationally-intensive smart contracts al- ready exists. The Golem Project, which crowdfunded 820,000 ETH on the morning of November 11, 2016 (∼$8.6 million on that date), already cites TrueBit as a verification mechanism for their forthcoming outsourced com- putation network [14]. TrueBit has the potential to support many practical new applications. Here are some examples.

  • Outsourced computation. TrueBit functions as a worldwide computa- tion market. Anyone in the world can post a computational task, and anyone else can receive a reward for completing it. The system’s incentive structure guarantees correctness of returned solutions.
  • Truly decentralized mining. Centralized mining pools pose security threats. For any Nakamoto consensus-based cryptocurrency, one can use TrueBit to build a practical, efficient, and trustless mining pool managed by an Ethereum smart contract (see also [51]).
  • Trustless cryptocurrency exchange. Currently users must find trusted exchange partners to transfer currency (or messages) between block- chains. TrueBit can facilitates secure transfer, for example, of doge- coins to the Ethereum blockchain and back (modulo some new opcodes for dogecoin) [8].
  • Scalable blockchain. By decoupling verification for miners into a sep- arate protocol, we can achieve high transaction throughput without facing a Verifier’s Dilemma.
  • Scalable “on-chain” storage. Swarm [28] is developing a platform for incentivized, peer-to-peer storage. TrueBit can make Swarm data ac- cessible to Ethereum smart contracts.

We discuss these ideas further in Section7.

Smart contracts

Smart contracts, introduced in 1994 by Szabo [61] and first realized in the Ethereum [11] cryptocurrency in 2015, are uncensorable programs that live on Ethereum’s blockchain and have their own executable code and internal states, including storage for variable values and ether currency balance. Pro- posed applications for smart contracts include outsourced computation and storage [13,28,50], prediction markets [2], decentralized venture funds [29], and Bitcoin mining pools [27,51]. Ethereum’s variable gasLimit restricts the number of computation steps that smart contracts can perform.

TrueBit itself is an Ethereum smart contract which allows users to call TrueBit contracts for trusted, computationally-intensive applications. Tradi- tional Ethereum smart contracts can call a TrueBit contract as a subroutine, thereby effectively bypassing Ethereum’s gasLimit. In other words, TrueBit increases the per block work that the Ethereum network can process cor- rectly.

How TrueBit works

TrueBit’s primary purpose is to realize correct, trustless computations de- spite miners’ limited computation bandwidth. Intuitively, we wish to reward participants who correctly perform computational tasks, but who decides whether these tasks were done correctly? In absence of a dispute, the party who performs a computational task on behalf of a TrueBit contract simply receives a reward. On the other hand, if a dispute does occur, we must

rely on the only trusted resource, the limited network of miners, to resolve it. Dispute resolution occurs as a “verification game” subroutine in TrueBit, and we will discuss its details in Section3.

Who initiates a dispute in the case of an incorrect computation, and how can we ensure that such a dispute will actually occur? One option is to assume that each party with some stake in the contract brings their own trusted (although not necessarily mutually trusted) verifier, and flags for a challenge if needed. This trusted verifier approach suffices for some applications, but in general parties may need to rely on the results of a previously executed TrueBit contract ex post facto even if they themselves did not exist at the time of its creation. Like Ethereum smart contracts, TrueBit contracts must have universal validity.

Since there exist no trusted parties on Ethereum’s network, by symmetry we must allow any party to be hired to solve any computational task, and similarly anyone should be able to challenge a Solver’s outcome of a com- putational task. The latter requirement ensures that TrueBit operates by unanimous consensus. Assuming that an honest and motivated Challenger exists who will challenge any wrong answer given by the Solver, the system’s incentives for the Solver are straightforward: reward correct solutions. But how can we make sure that such a motivated Challenger exists?

Idea 1. Offer rewards for checking computations.

While this strategy may encourage participation, it provides no incentive for correct verification. We have no a priori reason to trust that the Verifier will substantively inspect the computation task, and we gain no additional assurance by increasing her reward for checking.

Idea 2. Offer rewards for finding bugs.

While a bug bounty may incentivize a Verifier to correctly perform a check, it only does so when the Verifier has some reason to believe that a bug might actually exist. Unless a potential Verifier believes that she has some real chance to find a bug, in practice we cannot expect her to participate in the system. In summary, we need both an incentive to partake in verification and an incentive to perform checks correctly. This leads us to the following proposition.

Idea 3. Offer a bug bounty and provide expectation that bugs will exist.

TrueBit takes this third approach by occasionally forcing Solvers to sub- mit incorrect solutions. During “forced errors” TrueBit reverses the normal

system incentives: the Solver gets paid for submitting an incorrect solu- tion and penalized for submitting a correct one. We discuss the details of TrueBit’s forced error protocol in Section4.

Let us return to our dispute resolution subroutine. At the heart of TrueBit’s protocol lies an interactive “verification game” which decides whe- ther or not a contested computational task was performed correctly (see Section3). The verification game proceeds through a series of rounds, where each round recursively checks a smaller and smaller subset of the compu- tation. A trusted network, in our case the Ethereum platform [11], merely enforces the rules of the game and therefore does not bear the bulk of the verification burden. Anyone on the network can generate tasks, compute, or verify in exchange for rewards. TrueBit does not require “honest” or “altru- istic” nodes for correct operation but rather runs under the assumption that each node wishes to maximize its own profit. We shall discuss the details of TrueBit’s incentives for participation in Section4.

System properties

Any outsourced computation system should be fair in the sense that parties who perform computational tasks indeed receive compensation for their work and reliable in the sense that paid computations are performed correctly. In addition to these properties, TrueBit also ensures accessibility through easy joining or exiting of the verification ecosystem. Any anonymous party can play the role of Task Giver, Solver, or Verifier, where a Verifier is a party who checks solutions and becomes a Challenger whenever she reports an error. In particular, TrueBit does not trust or rely on the reputation of its participants. Everyone who puts a deposit into the system has a fair chance to be hired for a given computational task.

TrueBit offers several novel advantages over traditional cloud computing

and verifiable computable models. Verifiable computing ensures a correct answer for an outsourced computation by forcing the cloud to provide a short proof which witnesses the correctness of its computation. The idea is that this “short proof” should be much easier to check than performing the computation oneself. Researchers have achieved much progress on this method in recent years, however the cryptographic setup costs and computa- tional overhead for the cloud in state-of-the-art systems make these methods unsuitable for most real-world applications. Moreover, many of the proof- based systems to-date, including Zaatar, Pinocchio, Ginger, and TinyRAM, require one to run thousands of instances of a single function before breaking even on the overall verification time for a 128 × 128 matrix multiplication

puzzle [65]. The new cryptocurrency Zcash [34] successfully makes use of verifiable computing machinery, albeit only for a very restricted class of computation tasks. Below we contrast TrueBit with verifiable computing and traditional cloud computing.

      1. Incentives. Unlike traditional cloud computing, where the user simply trusts the cloud to provide correct answers, TrueBit provides financial incentives to ensure correctness.
      2. Transparency. The entire inner workings of TrueBit’s interpreter sit on the blockchain and are open for inspection (see Section6). Further- more, the user community can democratically update the interpreter as needed.
      3. Efficiency. Solvers in TrueBit have low computational overhead and minimal initial setup costs. The verification game (Section3) does introduce some extra work, but in practice, due to high penalties for wrong answers and bogus challenges, we expect participants to appeal to the verification game only rarely, if at all.
      4. Simplicity. TrueBit’s operation is relatively straightforward. Unlike traditional verifiable computing, TrueBit avoids deep probabilistically checkable proofs (PCPs), succinct non-interactive arguments of knowl- edge (SNARKs) [37], and exotic cryptographic assumptions (e.g. those used in zkSNARKs [38]). The standard cryptography used in TrueBit, namely hash functions and digital signatures, sits in the underlying blockchain network and does not surface in the TrueBit protocol itself.
      5. Adaptability. TrueBit runs on top of Ethereum’s current protocol with- out impacting functionality.
      6. Keyless entry. Participants do not need to manage cryptographic keys beyond those used in Ethereum. TrueBit establishes identities through financial deposits alone, and therefore the system avoids risks from cryptographic trapdoors.

Assumptions

Traditional distributed systems models focus on tolerance against arbitrary, Byzantine errors. In Ethereum and other Nakamoto consensus-based cryp- tocurrencies, however, we have no reason to expect arbitrary errors—nodes generally deviate from protocol due to financial incentives. Our system model makes two basic assumptions about the underlying network.

  1. There exists a trusted network (i.e. Ethereum) that correctly performs very small computation tasks.
  2. Participants are rational in the sense that they act to maximize their individual profits. In particular, we expect that CPUs will show up to do computational tasks if and only if they expect fair compensation for their work.

The consensus computer [50] shows how one can use the incentives in Ether- eum to establish assumption (i) above both in theory and in practice. The task of our new system will be to amplify this small amount of correct com- putation in order to handle larger tasks using assumption (ii). Even though solo Ethereum miners rarely earn mining rewards, expected long-term re- turns suffice to merit their participation in the system (via mining pools). As we shall discuss in Section3.1, anonymous parties may not necessarily perform correct computations when economic incentives, including conser- vation of CPU cycles, pressure them to do otherwise. This observation justifies assumption (ii). We emphasize that we do not find the assumption that there exists a single honest (and non-lazy) participant [46] to be real- istic. A person who loyally contributes CPU cycles without incentive to do so most likely does not exist in a large, general purpose, public system.

This paper takes a somewhat simplified view of Nakamoto consensus. In some cryptocurrencies, such as Bitcoin, miners can selectively censor a class of transactions by deciding not to include its members in their blocks. In the case of TrueBit, censorship of challenges (see Section4) or transactions in the verification game (Section3.2) could result in catastrophic, bogus solutions being accepted on the blockchain. In Ethereum, however, censor- ing individual transactions is difficult as miners cannot easily see where an Ethereum transaction might call without executing it. Indeed Ethereum users can obfuscate the function of their transactions, making it computa- tionally costly for miners to hand-pick transactions for their blocks. Hence we shall assume that censorship of individual transactions does not happen. In Section5.1, we analyze an attack on TrueBit in which miners withhold entire blocks and show that it is not profitable.

Attacker model

TrueBit’s security relies on two basic assumptions about the behavior and capabilities of potential attackers.

  1. Attackers cannot subvert the underlying consensus computer. The

Judges in the verification game (Section3) always adjudicate correctly, as do the Referees in the incentive layer (Section4).

  1. Attackers are rational. We assume that attackers have limited financial resources and will not deviate from protocol unless it is profitable for them to do so.

While an adversary with sufficient computational power can derail a Naka- moto consensus-based cryptocurrency [56], assumption (i) above neatly sweeps this issue outside our domain. TrueBit itself does not utilize proof- of-work. Although we have chosen to implement TrueBit in the Ethereum platform for convenience, we could easily replace the “Judges” from (i) with any other universally trusted computing mechanism without affecting the system’s functionality.

Due to anonymity of all participants, Sybil attacks, in which a single party deviantly assumes multiple identities on the network, pose special threats to TrueBit (see Section5.1and Section5.2). We will assume that an attacker may create as many identities as she likes, including simultaneously playing the roles of Task Giver, Solver, and Verifier, appearing as two dis- tinct Verifiers, or managing multiple identities via pooled financial resources (see Section5.3). Using assumption (ii) above, we shall argue that TrueBit resists such attacks in Section5.

The “forced errors” described in the beginning of this section (Section2) pose a special challenge to TrueBit due to opportunistic attacks. As the prize for discovering forced errors is necessarily enormous, attackers may seek ways to obtain these large prizes prizes without performing the intended verification tasks (Section5.4). For this reason, the occurrence of forced errors must appear truly random to all prospective Verifiers. Moreover, Solvers must be incentivized from prematurely sharing the fact that their error was forced or claiming that an error was forced when it really wasn’t. In general, we want consistent, not sporadic, verification. At least one Verifier should be present for every task. The attraction of forced errors represents the “Achilles’ heel” of TrueBit, and we devote the incentive structure in Section4to careful defending of this potential attack surface.

Dispute resolution layer

In this section we present and analyze TrueBit’s dispute resolution mecha- nism. TrueBit relies on network “Judges” with limited computational power

who adjudicate all disputes, and we show how such Judges can correctly resolve disputes over complex computations.

Bottleneck: The Verifier’s Dilemma

Ethereum’s incentive structure severely restricts the amount of computation work that smart contracts can accurately enforce. Users interact with smart contracts by broadcasting transactions to the network. Such transactions contain program scripts which require computational work to verify their validity prior to miners’ acceptance on the blockchain. Each miner must not only locally verify but also remember the state of every single smart contract. On the surface, this redundant work limits smart contracts’ computational scope due to inefficiency, but the true bottleneck lies deeper at the incentive level.

Let us recall miners’ incentives for participating in Ethereum. In Naka- moto consensus-based cryptocurrencies [56], new transactions enter the block- chain through a process called mining. Participating miners maintain the integrity of the blockchain data by racing to solve computation-intensive, proof-of-work puzzles in exchange for rewards. The first miner who success- fully broadcasts a solution to the current proof-of-work puzzle proves that she has spent the necessary computation power to merit appending her new set of transactions to the blockchain, and this step awards the miner a set of newly minted coins. Then the race begins again on top of this new block.

When an Ethereum user broadcasts a transaction to the network, the miner who appends her valid transaction to the blockchain receives a trans- action fee for the service. On the other hand, Ethereum’s mining protocol dictates that other miners should verify this transaction gratis, for the “com- mon good” so to speak. More specifically, the mining protocol dictates that the longest chain consisting of valid transactions is the correct one, and miners should mine on top of it. If the time to verify new blocks is negli- gible, miners may indeed agree to check them, but if the verification work becomes substantial, then miners risk falling behind in the mining race by following protocol and actually verifying. In order to save time, rational miners might simply skip the verification step, leaving the blockchain open to invalid transactions. On the other hand, by skipping verification, a miner might also waste her CPU cycles by mining on top of a stale chain which the rest of the community deems invalid. Either way, the miner risks forfeiting rewards. A rational miner doesn’t know what to do!

Any substantial verification burden placed on miners thus results in a

Verifier’s Dilemma [50]. Because of the Verifier’s Dilemma, smart contracts

whose verifications require non-trivial computational efforts will fail to ex- ecute correctly as rational miners, who may choose to deviate from the suggested protocol, stop donating their limited resources to thankless ver- ification tasks. In short, Ethereum lacks a scalable verification mechanism and remains vulnerable to serious attacks [50] when miners have to verify more than a tiny amount.

TrueBit contracts securely execute computationally-intensive tasks for use in Ethereum smart contracts. The system’s core consists of a smart contract interface where a user can request a solution to a computational task or puzzle, and anyone can solve it in exchange for a reward. Our inter- active verification game, described below, empowers users to both outsource arbitrary computations in a decentralized manner and receive back correct solutions. To achieve this goal, we will not only reduce the redundancy of verification work but also refine the incentive structure responsible for the Verifier’s Dilemma.

Solution: The verification game

The goal of TrueBit’s interactive verification game is to resolve a given dis- pute between a Solver, who has provided a solution to some computational task, and a Challenger, who disagrees with the solution and therefore calls the verification game to be played. The outer layer of TrueBit (Section4) uses the verification game as a subroutine. The roles in the verification game include a Solver, who offers a solution to a given task, and a Chal- lenger who disagrees with the Solver’s solution. The final party, the Judges, always perform computations correctly but possess extremely limited com- putational bandwidth. The Judges in TrueBit are the entire community of Ethereum miners who reach verdicts through Nakamoto consensus.

The verification game proceeds in a series of rounds, each of which nar- rows down the portion of the computation in dispute. In the first round, the Challenger forces the Solver to commit to a pair of computation steps delimiting some time interval. In the next round, the Challenger iteratively challenges a subset of the Solver’s computation over this time interval, and next she challengers over a subset of that subset, etc. until in the final round the final challenge is sufficiently trivial that the Judges can make a final rul- ing on whether the challenge was justified. The Judges also enforce that the Solver and Challenger follow the rules of the game. At the end of the verification game, either the cheating Solver will be discovered and punished in the outer layer of TrueBit (Section4), or the Challenger will pay for the resources consumed by the false alarm.

Example (matrix multiplication [46]). We give an illustrative example of a verification game. Suppose that the given task is to compute the product of two matrices A and B. The Solver claims that A × B = C. In the first round, the Challenger must claim an error in one of the entries in C, and outputs coordinates i, j such that

n

Σ

ci,j ƒ= ai,k · bk,j

k=1

with corresponding evidence consisting of partial sums d0, d1, . . . , dn where

m

Σ

dm = ai,k · bk,j .

k=1

The Judges only verify that i, j are coordinates and that dn ƒ= ci,j and that d0 = 0. The Challenger loses if this fails to hold. In the second round, the Solver tries to defend himself by providing a k such that dk ƒ= dk−1 +ai,k ·bk,j . If the Judges can verify this claim then the Solver wins the game, and otherwise the Challenger wins.

Detailed protocol

We can use a variation of the Example above to check arbitrary computa- tions by creating a binary search to pinpoint the erroneous (or not erro- neous) step in a computation. The idea for the following verification game is essentially due to Canetti, Riva, and Rothblum [39,40] and was later in- dependently discovered by the authors of this paper [46,58]. Canetti, Riva, and Rothblum did not consider this game in the context of blockchains but rather a simpler scenario in which a user outsources a computation to k dif- ferent clouds under the assumption that at least 1 out of k of the clouds will perform the computation correctly. The assumption of at least one trusted Solver/Challenger cloud is too strong for our purposes since in a purely ra- tional network, such a Solver/Challenger may not exist. Section4describes our workaround for the smart contract setting.

For purposes of reducing the number of rounds of interaction and space demands on the blockchain, our protocol roughly follows [39,40] which com- bines the (parametrized) binary search optimization from [46, Proposition 9] with the probabilistic Merkle tree construction from [58]. Recall that a Merkle tree is a binary tree in which each node is the hash of the concate- nation of its children nodes. In general, the leaves of a Merkle tree will collectively contain some data of interest, and the root is a single hash value

which acts as a certificate commitment for the leaf values in the following sense. If one knows only the root of a Merkle tree and wants to confirm that some data x sits at one of the leaves, the holder of the original data can provide a path from the root to the leaf containing x together with the children of each node traversed in the Merkle tree. Such a path is difficult to fake because one needs to know the child preimages for each hash in the path, so with high likelihood the data holder will supply a correct path if and only if x actually sits at the designated leaf.

For clarity, we will present the verification game in the familiar and stan- dard language of Turing machines, but in actuality TrueBit uses the Google Lanai architecture to simulate steps in the computation (see Section6). The verification game consists of three parties:

  • a Solver who has allegedly performed some computational task and claims some solution,
  • a Challenger who disputes the correctness of the Solver’s solution, and
  • Judges with bounded computational power, who will rule on whether the output is correct.

TrueBit fills the roles of Solver and Challenger according to the incentive layer discussed in Section4. The Judges are the set of miners in Ethereum who collectively record their decisions on the blockchain. The incentive layer (Section4) determines the financial consequences of the Judges’ verdict.

To start, the Solver and Challenger each privately compile a tableau of Turing configurations mapping each time step of the task to its full internal state. Assume that the task runs in time t steps and space s, where s bits suffice to include the complete state of the Turing machine at any given time, including tape contents, head position, and machine state. The game protocol includes a fixed, integer parameter c > 1 which says how many configurations the Solver broadcasts to the blockchain in each round of the verification game. More configurations per round mean fewer total rounds needed, however it also means that the total number of configurations sent to the blockchain increases. In particular, we reach diminishing returns as the data sent per round reaches Ethereum’s per block capacity.

The verification game protocol proceeds as follows.

Main loop. The following steps are done iteratively to pinpoint the source of disagreement. The protocol determines timeout periods within which Solvers and Challengers must respond. Failure to respond within time bounds results in immediate loss for the non-responding party.

      1. The Solver selects c configurations equally spaced apart in time across the current range of the dispute. In the initial iteration, for example, the Solver selects configurations across the entire tableau at the following time steps:

t 2t

, ,

c c

3t ct

, . . . , .

c c

He then produces c Merkle trees, each with s leaves, where the leaves constitute the complete machine state at each of these times, and broadcasts each of the roots of these Merkle trees to the blockchain.

      1. The Challenger responds with a number i c, indicating the first time step in this list that differs from her own, and broadcasts this number i to the blockchain.
      2. The Judges check that the Solver indeed posted c Merkle roots to the blockchain and that the Challenger’s value i satisfies 1 ≤ i c. If either of these checks fails, the Solver, or respectively the Challenger, loses immediately.
      3. The next iteration of the game continues similarly, but restricted to configurations between the i − 1-st and i-th indexed configu- rations. Here we interpret a 0 as the computation’s initial con- figuration.

Final stage. After log t/ log c rounds, the loop above converges to the first, disputed computational step, and the Judges explicitly check this step of the computation as follows. Suppose that the first disagreement occurs at time e. The Solver provides paths from the Merkle root for time e to its leaves containing:

  • the location of the machine head at time e,
  • the content of the tape square under the machine head, the tape contents of its left and right neighbors, and
  • the machine state at time e.

The Solver also provides the analogous paths for time e−1. The Solver loses the dispute if the Judges find that any of these paths fail to be valid. Finally, using this information and the original task code (which existed on the blockchain prior to the verification game), the Judges check the computational step at time e and rule whether the challenge was justified.

Runtime and security analysis

We now show that the work done by the Judges is small compared to the work required to perform the task, and that the number of interaction rounds is modest. Note that the majority of effort from the Judges simply involves recording data onto the blockchain. The Judges’ work is split over many transactions and therefore avoids any chance of a Verifier’s Dilemma.

We fix σ as the size of each hash in the Merkle trees, and p as the space required to store a single machine state for the computation. For computational tasks that run in time t and space s with c configuration roots posted per round, the protocol requires log t/ log c rounds of interaction on the main loop, writes + log c bits to the blockchain during each iteration of the main loop, and writes 2 · ((2σ + 1) log s + 5 + p) bits in the final round (assuming that tape squares can contain either 0, 1, or be blank, so we need 3 · log 3/ log 2 ≈ 5 bits to describe 3 of them). Summing these parts, the total time required for community-wide verification from the Judges is then

Σ Σ· ·

O log t (+ log c) + 2 ((2σ + 1) log s + 5 + p) log c

where the hidden constant depends on the implementation of this protocol on the Judges’ local machines. This estimate does not include the space and time required to store the instructions for the initial computation task. We assume that this data already exists on the blockchain before the verification game begins.

In comparison with other proposed blockchain-based computation sys- tems, or even PCP-based verifiable computing systems [65], the work im- posed on the Judges by the verification game is small [12], as is the compu- tational overhead required to join the system [63]. Moreover, the economic costs to the Task Giver, who pays only a Solver and a Challenger and not all Judges to process the full task, is modest, as are the computational over- heads for the Solver and Challenger.

Note that larger values for σ reduce the chances of hash collisions. Hash collisions might allow a dishonest Solver to substitute configurations in his computation tableau with fake ones sharing the same Merkle roots. Two consecutive configurations in the tableau whose Merkle roots happen to agree with the Merkle roots of two consecutive configurations that the dis- honest Solver wishes to apply as substitute suffice to disrupt the integrity of the verification game protocol. Indeed, the Challenger will agree with both Merkle roots, and the Judges will confirm that the transition between them is valid and therefore incorrectly reject the challenge. For a fixed size

σ, the chance that such collisions occurs by accident even becomes likely for sufficiently enormous tasks.

One could entirely eliminate the security risk discussed in the previous paragraph by posting complete machine states on the blockchain rather than just Merkle roots, but this makes the protocol more expensive. Alternatively, one could either increase the parameter σ or the number of states checked by the Judges in the final step. In this sense, the choice of σ bounds the maximum complexity of secure computations in TrueBit. We will also see in Section4.1that the capital held by the TrueBit contract as a “jackpot” for finding a forced error poses an additional constraint on computational ca- pacity, but this value can scale as well. Finally, the effectiveness of TrueBit’s verification game may degrade for extremely complex tasks due to the com- putational limitations of the Challenger. If the expected execution time of a verification game exceeds a human life expectancy, for example, then potential Challengers may lack motivation to participate despite financial incentives. Note that even when a task itself is feasible, its correspond- ing verification game may not be as the verification game carries significant overhead. We anticipate that future versions of TrueBit may optimize the verification game for certain kinds of tasks so as to reduce this discrepancy (see Section7.4).

Remark. The verification game provides a significant privacy-protecting ef- fect in that only disputed parts of a given computation must touch the public eye [44].

Incentive layer

We now discuss the financial incentives which encourage anonymous par- ties to both contribute CPU cycles and perform requested computational tasks correctly. We use the dispute resolution layer from Section3as a subroutine and connect its role to the present construction. A Verifier is a potential Challenger, that is, someone who checks submitted solutions and calls for a challenge (see Section3) if and when she detects an error. By assumption (ii) in Section2.2, Verifiers must receive adequate payment for verification tasks. As argued in the beginning of Section2, however, Veri- fiers can only receive rewards when they actually find bugs. It follows that the reward for discovering errors must encompass amortized payment for all verification tasks checked, including those in which no bugs were found.

Any Ethereum user who offers a reward through TrueBit for performing a computational task is called a Task Giver. A party which offers a solu-

tion for performing these tasks in exchange for a reward is called a Solver, and, as indicated in the previous paragraph, Verifiers check that Solvers’ solutions are correct. Solvers and Verifiers will play the roles of “Solvers” and “Challengers” respectively from the dispute resolution layer (Section3) whenever a Verifier initiates a verification game. Referees enforce the rules of the TrueBit protocol in the incentive layer. While Ethereum miners enforce protocol rules both through the roles of “Judges” in Section3and the role of “Referees” in the present section, we use distinct labels for them because the layers and functions differ for these roles. The primary role of Judges is to interactively resolve a dispute, whereas Referees primarily enforce that Solvers and Verifiers timely submit appropriate data in the incentive layer.

In a nutshell. The main steps of a TrueBit contract are as follows. Items in quotes will be throughly explained in due course.

1.A Task Giver announces a task and offers a reward for its solution.

    1. A Solver is elected by lottery, and he prepares both a “correct” and an “incorrect” solution to the task.
      1. If a “forced error” is in effect, the Solver reveals the incorrect solution on the blockchain.
      2. Otherwise the Solver reveals the correct one.
    2. Verifiers check the Solver’s solution. They win a large “jackpot” pay- out if both:

(a)they correctly identify the solution as erroneous, and (b)a forced error was in effect.

    1. If no Verifier signals an error, then the system accepts the solution. Otherwise acceptance depends on the outcome of a verification game.

In the rest of this section, we discuss the security measures in TrueBit which ensure that participants comply with the skeletal procedure outlined above. A more complete description of the protocol appears in Section4.6.

Jackpots

TrueBit periodically imposes forced errors in which a Solver must offer a wrong solution to a task (see Section2). This ensures that Verifiers who diligently search for errors will eventually find them. Accordingly, Verifiers

who correctly report forced errors receive substantial jackpot payouts. By design, Verifiers cannot predict when forced errors will occur and therefore have incentive to check all tasks carefully. Forced errors occur only rarely, and we expect Solvers to make only few, if any, other errors. Rewards for identifying unforced errors may be modest compared to jackpot payouts, and we consider any such payments of these type incidental, rather than fundamental, to TrueBit’s secure incentive structure.

The jackpot payout effectively bounds the complexity of computation tasks that TrueBit can perform securely. The Verifier must, on average, re- ceive a payment which fairly compensates her for the task at hand, which means that the jackpot payout should at least consist of fair compensation for the current task times the forced error rate. In this way, the jackpot cap bounds the Verifier’s expected per task compensation which, by assump- tion (ii) in Section2.2, restricts Verifiers’ available CPU cycles per task.

We fix a rate for forced errors among tasks. This fraction should not be so low so as to discourage Verifier participation through excessively infrequent rewards, but neither should it be so high so as to run the risk of Referees bias (see Section5.1). We set forced errors to occur, on average, once every thousand tasks.

Taxes

Given that a jackpot repository must exist, we now describe the mechanism for funding it. We assume that a generous philanthropist deposits some initial funds into the repository, but thereafter TrueBit will be self-sustaining through taxes. Any Task Giver that calls a TrueBit contract must pay not only the cost of computational work done by the Solver but also for the work done by the Verifier(s) (excluding unforced errors and bogus challenges), as well as the work done by Referees and Judges. We refer to the latter two costs as the verification tax. We ensure that the jackpot repository never disappears entirely by placing a cap on the jackpot size. To this end, we set the maximum jackpot payout for a forced error to be one third of the total repository size.

While a single, attentive Verifier suffices to ensure correctness and ach- ieves ideal tax efficiency, in practice the verification tax requires a substantial cushion. We estimate the necessary verification tax to be 500% – 5000% of the cost of performing the given task. As we shall see in Section5.2, there is a quantitative tradeoff between tax efficiency, Solver deposits, and security of computations, so Solvers could potentially absorb some of this tax burden by contributing higher deposits. Our tax rate estimate incorporates Veri-

fiers’ incentives for participation and the fact that both the Task Giver and the Solver for a given task each have incentive to perform verification. Par- ticipation from these parties may necessitate higher taxes because the total jackpot payoff decreases exponentially as the number of challenges increases (see Section5.3). Expected jackpot payoffs must be sufficiently high to con- sistently attract at least one Verifier per task. The required tax amounts may also depend on peculiarities of human behavior. Indeed, we may have to pay Verifiers more than Solvers per CPU cycle because Verifier rewards have the human-undesirable property of being sporadic whereas Solvers al- ways receive immediate rewards. Thus the optimal tax rate must, at least in part, be determined experimentally.

Deposits

TrueBit requires deposits from Solvers and Verifiers in order to thwart Sybil attacks (see Section5.1) and incentivize correct computations. We set these deposits to be more than the expected jackpot payout for a given task plus the cost of playing a verification game. In particular, the deposits must be large enough to:

      1. pay for the (expensive) cost of a verification game, including all re- wards and penalties for Solver and Challengers and work done by Judges,
      2. discourage Solvers and Verifiers from sacrificing deposits in order to obtain jackpots without performing verification (see Section5.1and Section5.3),
      3. discourage Task Givers who collude with Solvers in effort to get bogus solutions onto the blockchain (see Section5.2),
      4. refund taxes to the Task Giver in case the Solver causes an unforced error,
      5. deter Solvers from randomly guessing solutions to obtain task rewards instead actually performing computations (as might be profitable for binary decision tasks with very high task rewards), and
      6. deter external, temporal pathologies.

Note that the currency used to pay Solvers and Verifiers need not be the same as the currency used to pay Judges and Referees, but for simplicity of presentation, we fix ether (ETH) as the unique underlying currency.

As an example of the second type, consider a situation where the Solver deposit is small (say 10 ETH) but the expected jackpot payout per task is high (say 1000 ETH). An individual playing both the role of the Solver and Verifier could offer a bogus solution and then challenge his own answer, hypothetically netting, on average, 1000 − 10 = 990 ETH without providing any useful service. Such an action would degrade other Verifiers’ incentive to participate.

As an example of the last case, if the outcome of a time-sensitive TrueBit contract controlled a 1000 ETH payout but only required a 10 ETH Solver deposit, then the party who stands to lose 1000 ETH from the TrueBit contract could attempt to cause a delay by posing as a Solver to the TrueBit contract and giving a bogus answer. We leave it to the Task Giver to determine appropriate minimum deposit values for such specific situations, as such contextual determinations lie outside of the scope of TrueBit itself.

Generating forced errors

In order to motivate verification of all tasks and to guard the jackpot repos- itory against swindle, forced errors must appear unpredictably. TrueBit uses strings of random bits to determine whether or not a forced error occurs for a given task. The system derives its unpredictability via the following properties.

      1. The Task Giver does not know the random bits at the time that she announces a task.
      2. The Solver does not know the random bits until after he has committed his solution.
      3. Verifiers do not know the random bits until after they decide whether to challenge.

The first property makes it difficult for a Task Giver to create a task designed to swindle the jackpot repository, and the second discourages Solvers from being lazy and only volunteering to solve tasks which have forced errors. In order to satisfy these three properties, TrueBit combines random bits from the following two sources:

(a)“private” random bits from the Solver, and

(b)the hash of the block mined immediately after the block containing the Solver’s solution.

By hash of the block, or block hash, we more precisely mean the hash of the block’s header, a roughly 200-byte piece of data that contains the times- tamp, nonce, previous block hash, and Merkle root hash of the transactions occurring in the block [32]. Source (b) achieves properties 1. and 2. above, and source (a) achieves property 3.

TrueBit generates Solvers’ “private” random bits using a method reminis- cent of RANDAO’s random generator mechanism [25]. Before participating in a TrueBit contract, a Solver must privately generate a string of random bits r and publish its hash on the blockchain. This action commits the Solver to using r in the protocol without revealing the value r to others.

The Solver establishes property 2. above by committing both a hash of a “correct” solution and a hash of an “incorrect” solution prior to the broadcast of the block hash from item (b). At the time that this block hash is revealed, only the Solver, who has access to the value r, knows whether or not a forced error is in effect. He publicly designates one of his two hashed solutions for evaluation, however potential Verifiers do not know a prioiri whether the Solver intended his solution to be “correct” or “incorrect.” Only after the timeout for challenges has elapsed do the Verifiers witness the Solver’s private random bits in the clear and learn whether a forced error was in effect. In case the Solver’s solution is challenged, the Solver must reveal r as well as his second “correct” solution in order to prove that the error was forced and avoid a penalty (see Section5.4). Thus, in the case of a forced error, the Task Giver still obtains a correct solution. In case of an unforced error, i.e. when the Solver evidently fails to provide a correct solution, the Task Giver receives a full refund of her task reward and taxes (see Section4.3and Section4.6).

Although in theory one could securely run this part of the protocol with- out hashing the Solver’s solutions, hashing makes it easier for the Solver to provide a convincing “incorrect” solution which appears, upon casual in- spection, indistinguishable from a “correct” one. The Solver can effectively use any randomly selected “incorrect” solution during a forced error because he never has to reveal its preimage.

Solver and Verifier election

The system effectively chooses Solvers by lottery. When a task is announced, Solvers broadcast their interest in solving it to the Referees in the form of Ethereum transactions. Referees, or more specifically miners, choose one of these transactions to include the next block, thereby electing the Solver for that given task. In case the miner who chooses the transactions for the

current block happens to also hold a lottery ticket, he may bias his chances of winning the lottery. This bias does not affect the security of TrueBit, however, since the Solver must still provide a correct solution.

In contrast to the Solver selection, TrueBit does not impose a limit on the number of Verifiers, and we describe in Section5.3how multiple Verifiers who participate in a task must split their rewards after a successful challenge. Due to the computational work involved in verification, rational Verifiers will only verify tasks in case they expect to receive adequate compensation for their work (assumption (ii), Section2.2). Thus the number of Verifiers verifying each task will remain low due to the balance between computation costs and incentives (see Section5.2and Section5.3).

Protocol overview

In this section we present an overview of the TrueBit protocol. We will discuss how our chosen parameters enable system security in Section5. Throughout the description below, Solvers and Verifiers who deviate from protocol by broadcasting malformed data or failing to respond within time- out bounds forfeit their security deposits to the jackpot repository.

Preprocessing steps. The following must be done prior to commencing

TrueBit operation:

1.A substantial jackpot repository must be established for the TrueBit contract prior to commencement of any interactions (See Section4.1for details).

  1. Solvers and Verifiers that wish to participate must commit de- posits to the TrueBit smart contract (See Section4.3). The de- posits may be placed at any time during the protocol so long as the underlying contract has sufficient time to confirm receipt of the funds.
  2. Solvers must also generate private random bits and commit their respective hashes to the blockchain. We denote the private ran- dom bits of the unique Solver selected to perform the task below by r (see Section4.4for more details).
  3. A universal tax rate T must be established (see Section4.2).

Main algorithm. The protocol steps are as follows.

1.A Task Giver provides the following: (a)a computational task,

(b)the timeOut for accepting bids, performing the computation, and waiting for a challenge. In the protocol below, we do not distinguish between these various timeouts with distinct no- tation, and we colloquially use “timeOut” to refer to both events and lengths of time. In all cases, timeOut must be long enough to avoid microforks during which Referees tem- porarily disagree on current parameters.

(c)a reward for a correct output, which must be at least the cash equivalent of the task difficulty d based on timeOut (see Section5.5), plus a total tax of T · d. The reward is held in escrow by the TrueBit contract while the taxes are immediately deposited into the jackpot repository.

(d)the minimum deposit, minDeposit, needed to participate as a Solver or Verifier (see Section4.3, Section5.1, Section5.2, and Section5.3).

  1. Solvers who have the requisite minDeposit and random bits can bid to take on the task until the bidding timeOut. At most one Solver is selected (Section4.5), and if no Solver takes on the task in the allotted time, the task request is canceled and the Task Giver receives a full refund.
  2. The Solver privately computes task. In case of a timeOut, Solver forfeits his deposit to the jackpot repository and the protocol terminates.
    1. Solver commits two distinct hashes to the blockchain, thereby committing both a “correct” and an “incorrect” solution.
    2. The hash of the next mined block is revealed, and then Solver

knows whether or not there is a forced error (see Section4.4).

    1. Solver designates one of the two hashes as the hash of his

solution.

  1. Verifiers who have posted minDeposit can challenge (the hash of) solution until timeOut. Prior to timeOut, the Verifier must broadcast the hash of an even integer to the blockchain in order to commit to a challenge. Hashing an odd number in case of no challenge is optional and may be used to camouflage real chal- lenges from other Verifiers (see Section5.3). After timeOut, the Verifier broadcasts to the blockchain this hashed number in the clear to reveal her action.
    1. If no Verifier challenges solution, then
      1. Solver reveals r to prove that there was no forced error (i.e. the criteria in Step 4(b)i below fails),
      2. Solver reveals solution in the clear on the blockchain,
      3. Solver receives the task reward, and then iv.the protocol terminates.
    2. Otherwise, some Verifier challenges solution.
      1. Solver reveals his private random string r on the block- chain, and Referees check it against his commitment from preprocessing step 3. If the hash of the concatenation of r and the block hash following the solution announce- ment from 3(b) is small (as determined by the forced er- ror rate, see Section4.1), then a forced error is in effect (see Section4.4).
      2. If the value r reveals that Solver’s error was forced, then Solver must reveal his secondary solution in the clear (see Section5.4).
        1. If no Verifer challenges Solver’s secondary solution so- lution before timeOut, then Verifier wins a fraction the maximum jackpot amount J , scaled for task dif- ficulty. In case of challenges from multiple Verifiers, the jackpot is split among them. In more detail, if there are k distinct challenges, then each participating Verifier receives J/2k−1 times the ratio of the task’s difficulty to the system’s maximum task difficulty (See Sections4.1,5.3, and5.4for further discussion).

B.Otherwise the Solver must play the verification game with the challenging Verifier(s). Verifier penalties, Ver- ifier rewards, Solver penalties, and refunds to the Task Giver here are the same as described in the next step. In case the Solver successfully defends himself against all challenges, however, then the jackpot payouts pro- ceed as in Step 4(b)i above.

      1. Otherwise the error was not forced. Solver reveals solution in the clear, and then Solver and Verifier must play a verification game (Section3.2). In case of chal- lenges from multiple Verifiers, the steps below are re- peated until either some Verifier wins a challenge or Solver defeats all Verifier challenges. The Solver collects reward only once he wins all of the challenges, and if this does

not happen, then the Task Giver receives a refund on his reward, and tax payments are reimbursed through Solver’s deposit as described below.

        1. If Solver wins, then Verifier forfeits half of her deposit to the jackpot repository and the other half to the Solver, where the latter suffices to compensate the Solver for playing the verification game (see Section4.3).

B.Otherwise Verifier wins. Solver pays at most half of his deposit to the Verifier(s), according to the distribution scheme in Step 4(b)ii above, pays back the Task Giver’s taxes out of this amount, and forfeits the remaining funds to the jackpot repository (see Section5.1and Section5.2).

End of protocol.

Note that Solver does not receive reward when a forced error occurs in Step 4(b)ii. In such situations, we expect the Solver to challenge his own “mistake” and receive a jackpot payout much greater than reward. TrueBit makes this payout automatic since the Solver would lose the verification game anyway if he and Verifier were to play it out.

Sanity check

The table on the next page recaps the parameters used in the TrueBit pro- tocol and hints at how to estimate their exact values. Each parameter is either a fixed constant, an input from a participant or something already on the blockchain, or can be computed from the elements which appear above it in the table. By inspecting the list below from top to bottom, one can confirm that all notions are well-defined.

Defenses

We now analyze the security of the incentive layer from Section4. TrueBit’s security relies on the presence of at least one Verifier to check each task performed. In this section we show that available incentives suffice to guar- antee the existence of such Verifier(s) according to the network assumptions in Section2.2and the attacker model from Section2.3. TrueBit defends against shortcuts which could divert verification incentives to non-verifying parties. In particular, we show that TrueBit resists Sybil attacks, collusion

parameter dependency
dispute layer parameters

p, c, and σ (Section3) tax rate

(Section4.2, Section5.2) forced error rate

(Section4.1)

maximum jackpot payout (Section4.2)

cash equivalent of CPU cycles (Section5.5)

maximum task difficulty

(Section4.1)

  • fixed constants.
  • fixed constant (500% – 5000%).
  • fixed constant (1/1000).
  • 1/3 of current jackpot repository.
  • based on external markets.
  • maximum jackpot payout divided by cash equivalent of a CPU cycle.
task

timeouts (Section4.6) task difficulty

Solver reward (Section5.5)

expected jackpot payout (Section4.1, Section5.2,

Section5.4)

Solver & Verifier deposits (Section4.3, Section5.1,

Section5.2, Section5.3)

Parameters in this box are chosen by the Task Giver with minimums as de- scribed below:

  • long enough to avoid microforks.
  • determined by timeouts.
  • no less than the cash equivalent of the task difficulty.
  • cash equivalent of task difficulty, and number of active Verifier deposits.
  • more than the cost of verification game plus the expected jackpot pay- out. Also depends on tax rate.
actual number of Verifiers (Section4.5)

jackpot payout for challenging a forced error (Section5.3)

Payout for detecting an unforced error (Section5.1, Section5.2)

  • as many join (incentives limit over- participation).
  • based on maximum jackpot payout, actual number of verifiers, and ratio of task difficulty to maximum task diffi- culty.
  • at most half of Solver deposit is split

among all Verifiers (rest goes to jackpot repository).

Table: Relations between parameters used in TrueBit.

pools, opportunistic attacks related to jackpot payoffs, and certain external threats.

Pairwise Sybil attacks

In a Sybil attack, an adversary assumes multiple identities on the network in order to execute an exploit. Identities on TrueBit include Task Givers, Solvers, Verifiers, Judges, and Referees. By our assumption (Section2.3), Judges and Referees always function as intended, and we provide additional justification for this axiom here. In this subsection, we consider all sets of pairs among the remaining identity types and show that pairwise coopera- tion does not harm operation of the system. While parties can freely join and leave TrueBit, each identity must make a deposit in order to participate. This deposit alone is a general deterrent against Sybil attacks, however as we shall see, multiple identities do not provide much cheating advantage.

Judges and Referees. Recall that Ethereum miners play the roles of Judges and Referees in TrueBit. Our analyses in Section3.4and Section4.6 show that the work done by these parties is small, and hence they are not vulnerable to a Verifier’s Dilemma (Section3.1). Nakamoto consensus [56] therefore ensures that miners will not post bogus transactions, i.e. enforce rules incorrectly, lest other miners reject their block and cost them a block reward. Therefore the only threat from miners is block withholding, specif- ically with regard to random number generator bias (see Section4.4).

While in theory miners could discard blocks, along with their mining reward and associated random bits for TrueBit, in practice this may never happen. A miner who drops a block must expect in exchange some income greater than the usual mining reward in order to make this action worth- while. If such income were to come as a bribe from a TrueBit participant, it would have to come from a Solver since only the Solver, who has unique access to his own private random bits, knows how to bias miners’ random bits towards or away from a forced error. In short, miners cannot disturb randomness in TrueBit without a tip-off from a Solver. The Solver, however, has no reason to bias against a forced error because this would prevent him from challenging his own answer and winning the jackpot, or at the very least would invite others to share it with him, thereby decreasing the total jackpot payout (see Section5.3). Moreover, the Solver is unlikely to suc- ceed in biasing towards a forced error since miners have little control over their own block hashes. This “one-sided” effect of block withholding, which can lock away jackpot funds but (almost) never release them, makes block

hashes a safe source of randomness in TrueBit. Hence the Solver’s potential reward for challenging his own solution under a forced error is not merely an artifact of TrueBit’s incentive structure — it guarantees unbiased Referees.

Task Giver and Solver. A Task Giver who also acts as a Solver does not benefit from solving her own task. One idea for an attack vector would be to create a task such that the Solver’s private random bits force an error for the given task. Since the Task Giver cannot predict the random bits from the block hash at the time the task is created (Section4.4, part (b)), the Task Giver cannot create such a task. Moreover, the Task Giver cannot afford to flood TrueBit with useless tasks and solve them herself in the hopes of eventually, by chance, encountering a forced error. The Task Giver would pay more taxes to the jackpot repository than she would expect to win from the jackpot payout (Section4.2), even taking into consideration any rewards she might win by correctly solving her own tasks. Her losses are amplified through payments to other Solvers, Verifiers, and Judges who choose to participate in the protocol.

Solver and Verifter. The Solver’s burned deposit always exceeds the Ver- ifier’s income for successfully challenging an unforced error (see Section4.3). Hence the Solver has no incentive to challenge himself as a Verifier in such cases. Similarly, a Verifier can only lose her deposit by posing bogus chal- lenges. In certain situations it is conceivable that a Solver–Verifier pair could benefit from submitting a false solution and then challenging it due to temporal constraints external to TrueBit itself (as mentioned in Section4.3), and in such cases the Task Giver must determine the necessary deposits to deter such actions.

In the case of an forced error, we expect the Solver will challenge himself in order to win the jackpot. Nevertheless, TrueBit’s tax rate (Section4.2), and hence its jackpot payout (Section4.1), suffices to incentivize an inde- pendent Verifier to also check the solution. As we shall see in Section5.3, the Solver lacks incentive to claim the same jackpot more than once, hence the Solver’s self-verification does not spoil other Verifiers’ motivation to par- ticipate.

Task Giver and Verifter. A Task giver can certainly verify the solution she receives. This checking can only improve the reliability of the system! We cannot, however, assume that the Task Giver will always perform such checks due to the Task Giver’s possible computational and financial resource

constraints or lack of motivation to involve herself in the details of a TrueBit

contract.

The trifecta

Up until this point, we have implicitly assumed that the attacker’s goal was to extract money from the jackpot without performing verification. How- ever there is another possibility: the attacker wishes to get a bogus solution accepted onto the blockchain. No penalty scheme can deter all such attacks since TrueBit, as a closed system, has no way to estimate the true economic impact of a bogus solution on Ethereum’s blockchain. Hence we now con- sider scenarios in which a party is willing to sacrifice a jackpot payout (or more) in order to skew computational results.

Scaring off Verifters. Suppose that a Task Giver wishes to obtain an incorrect solution for a task. Posing as a Solver, she gives a wrong solution to her own task. Now the only thing that could stand in the way of her success would be a pesky Verifier. In order to dodge this obstacle, the attacker poses as a regular Verifier prior to posting her malicious task. She checks every task that comes along, until eventually he encounters a forced error. At this point, the attacker challenges not once but a hundred times, which according to Step 4(b)ii in Section4.6essentially wipes out the jackpot payout for herself and any other Verifiers that also challenged this solution. The upshot is that legitimate Verifiers no longer wish to participate because TrueBit failed to deliver their well-deserved jackpot payout, and more to the point, they have no way to estimate their income from future jackpots. So when the attacker finally broadcasts her malicious task, legitimate Verifiers simply ignore it.

As Verifier participation, and hence security, depends entirely on the expected value of jackpot payouts (assumption (ii) in Section2.2), TrueBit must ensure sufficient expected value for these payouts. We therefore offer the following default strategy for all Verifiers. New tasks are submitted to TrueBit at various and overlapping times. A prospective Verifier who wishes to earn rewards for verification should count the number of active Verifier deposits and compare this to the average number of simultaneous tasks oc- curring on the network. If the tax rate and task flow profitably support an additional Verifier, then the prospective Verifier submits a deposit, ran- domly chooses an active task, and begins verifying. As soon as she finishes verifying this task, she randomly selects another active task and begins ver- ifying that next one. Thus a Verifier with either more powerful hardware or

better algorithms can gain a market advantage in TrueBit. Such a Verifier can check more solutions per unit time and therefore increase her expected jackpot rewards per unit time. Verifiers are free to leave the network at any time, for any reason. If the current number of active Verifiers exceeds the number supported by the tax rate and task flow, then the probabilis- tic default strategy tells each Verifier to exit the system with the minimal probability that one would expect to restore equilibrium if all Verifiers were following the default strategy.

Verifiers who choose tasks to verify at random according to the default strategy, as opposed to changing their behavior based on guesses as to who- solved-what before, can reasonably estimate their compensation for partici- pation. When a significant number of Verifiers (not necessarily a majority) follow the default strategy, the adversary wishing to establish a bogus so- lution can neither predict which tasks will be verified, nor can he influence which tasks default strategy Verifiers will verify. According to assump- tion (ii) in Section2.2, there exists some bound on the adversary’s financial means. If not, he could simply flood the Verifier pool with deposits and be- gin verifying, which in turn would incentivize legitimate Verifiers to leave the system. Therefore the adversary can only post a bogus solution and hope that, by chance, it gets ignored. In this way, the default Verifier strategy spoils the scare-off strategy above.

Let us now estimate the chances that the adversary succeeds to post a bogus solution which all legitimate Verifiers fail to notice. We make the worst-case assumption that the adversary Solver always gets elected to solve his own task. Suppose that the tax rate (see Section4.2) supports at least 6 Verifiers per task, and that the adversary holds at most 1/6 of all Verifier deposits. Then effectively there are an average of 5 verifiers per task. Sup- pose n is the average number of simultaneous tasks on the network. Then the number of Verifiers is approximately 5n, and so the chance that none of them check a particular solution is the chance that each of them inde- pendently and randomly choose other tasks, or [(n − 1)/n]5n. Therefore the probability that the adversary gets caught, regardless of n, is

1 − .1 −

1 5n

n

Σ

> 1 − e

5 > 99%.

The leftmost inequality follows from the standard limit definition for the Calculus constant e. We conclude that the adversary will most likely lose his Solver deposit by attempting such an attack.

By raising taxes further, we can make such an attack even more costly. In this sense the tax rate, together with the minimums for Solver deposits,

bounds the mandatory “honesty” of the network. If bogus solutions never occur, then there would be no need for taxes. On the other hand, higher taxes make it more expensive to cheat. There is a tradeoff between overhead tax expenses for Task Givers, minimum deposits for Solvers, and security of computations.

High-stake tasks. For some high-stakes applications it is possible to en- tirely fall back on the dispute resolution layer of Section3. In a high-stakes situation, it is reasonable to expect that other parties who have a significant financial stake in the outcome of a given task will make an effort to verify the solution regardless of expected jackpot payouts. This motivation acts as a secondary deterrent against the “Scaring off Verifiers” attack above, as does the fact that the Verifier receives a fraction of the Solver’s deposit in case of positive error detection (TrueBit burns part of the Solver’s deposit in order to avoid collusion between Solvers and Verifiers).

Collusion pools

In this section, we analyze the potential effects of pooling computational, informational, and financial resources among Verifiers and Solvers. Par- ticipants in TrueBit may voluntarily form a Cartel which may in turn use external smart contracts to enforce cooperation among its (mutually dis- trusting) members, however, Solver and Verifier deposits in TrueBit alone prevent many potentially harmful financial collusions designed to extract jackpot funds without exerting computational effort. Secondly, carefully chosen jackpot payout amounts, together with Solver and Verifier incentives and private commitment channels, prevent financial gain from unintended signaling of private random bits or challenges.

Rich but powerless. First, consider a Cartel of Solvers with limited CPU bandwidth but deep pockets which volunteers to solve each and every task that comes along. The Cartel makes no attempt to provide correct solutions to any of these tasks, but instead intends to absorb lost Solver deposits un- til a forced error comes along, at which point it splits the jackpot among its members. By construction (see Section4.3), Solver deposits exceed the expected jackpot payout per task. Therefore, in the long run, such a Cartel strategy loses money. If members of the Cartel instead decide to start pro- ducing correct solutions, this does not harm security because that’s what they are supposed to do anyway. We remark that, also by construction (Section4.4), a Solver cannot simply decide to accept only tasks which have

forced errors because the Solver does not know whether or not a forced error is in effect until after he has committed his solution to the blockchain.

Similarly, a Cartel which challenges every single task in the hopes of eventually obtaining a jackpot sustains losses in the long run due to lost Verifier deposits.

A flood of Challengers. Timely information about forced errors has eco- nomic value. While Solvers can earn jackpots by challenging their own forced errors, they risk sharing jackpot payouts by divulging information about such errors. It follows that Solvers have incentive to keep their knowledge of forced errors private and their fake solutions convincing. Information about active challenges also has economic value because a challenge posted to the blockchain could be a tip-off about a forced error. For this reason, Verifiers have incentive to announce “fake” challenges rather than remaining silent on tasks in which they don’t detect errors (See Step 5. in Section4.6). “Fake” challenges serve as noise to mask real challenges and protect Verifiers’ jack- pot revenue from potential copycats who might challenge just because they see someone else doing it.

If the jackpot for a given task were a fixed amount and equally divided among challenging participants, then one could potentially flood the set of participants with aliases to obtain more than one’s share of the jackpot. A Cartel in which members inform each other about forced errors could therefore have the economic means to flood the network with challenges and monopolize jackpot payouts. This in turn would dilute jackpots for legitimate Verifiers which would, in turn, degrade Verifier incentives for participation. While such a Cartel might successfully detect forced errors, it might also cause tasks with unforced errors to go unverified.

In order to dissuade a Cartel from flooding the network with harmful challenges, we must reduce the total jackpot payout based on the number of challenges that occur. The total jackpot paid out on a forced error with k challengers must not exceed

J

2k−1 . (5.1)

The reason is as follows. We want a Challenger’s net reward to decrease if she superfluously clones herself via the Cartel. Assume that there are n legitimate challenges plus a Cartel which initially contributes k challenges to the pool and is considering adding another alias. We want the per challenge jackpot distribution among n + k challenges to be less than it would be among n + k + 1 challenges regardless of k, and n. Let Ji denote the total

reward paid out when there are exactly i challenges. Then we want

k k + 1

n · Jn+k > n + 1 · Jn+k+1,

or

n + 1 k

Jn+k+1 < Jn+k · ·

n

>1 sfor˛a¸llxn≥0

.

1/2sfo˛r¸alxl k≥1

k + 1

Thus it suffices to set Jn+k+1 = Jn+k/2, or by induction Jk J1/2k−1. In fact we cannot do better than this because the case k = 1 asymptotically forces a matching bound.

The upper bound in (5.1) does not take into consideration the Verifier’s cost of playing the verification game. In the case where the Solver’s solution has been found to have an error through one of the challenges, there is no need to repeat the verification game with the other challenges. The Verifier who actually plays the verification game with the Solver receives a bonus from the Solver’s deposit to compensate her for the CPU cycles spent during the challenge.

On low-hanging fruit

We analyze the security implications of Verifiers and Solvers who oppor- tunistically disappear for hard tasks and reappear for easy ones, or who take advantage of solution discrepancies.

Easy winners. If the jackpot for all tasks were equal, rational Verifiers might choose to verify only simple tasks and ignore complex ones. For this reason, the jackpot for each task scales proportionally with the task’s complexity (Step 5. in Section4.6). Scaling ensures that a Verifier’s expected jackpot payout per CPU cycle remains constant across all tasks, and it equally incentivizes Verifiers to inspect simple and complex tasks. Solvers always appear to perform tasks since the minimum task reward suffices to compensate them for their work (see Section4.6and assumption (ii) in Section2.2, ).

Multiple solvers. For security reasons, TrueBit explicitly does not allow Task Givers to hire redundant Solvers for a single task (Main Algorithm, Step 2., Section4.6). Suppose that two Solvers provided solutions to a single task and exactly one of them receives a forced error. Any Observer who notices a difference in the two solutions on the blockchain could, without

verifying anything, challenge both solutions. By playing two verifications and sacrificing one deposit, such an Observer could potentially win a jackpot at negligible cost, thereby degrading the overall incentives for verification.

Forced error in disguise. What happens if a Solver has a forced error but gives a correct solution anyway? The Solver could then challenge his own solution while other Verifiers ignore it, resulting in both the Solver receiving a bigger share of the jackpot as well as damage to the system’s verification mechanism. Therefore, when the Solver reveals that a forced error was in effect, the protocol also forces him to reveal his committed “correct” solution which must be distinct from the “incorrect” solution that he showed at first. If the second solution revealed by the Solver is correct and distinct from his first solution, then by uniqueness of deterministic processes, the first solution must have been incorrect (as desired). Verifiers have an opportunity to challenge this second solution. If an error is detected in it via a verification game, the Verifier(s) split the Solver’s deposit rather than the jackpot payout according to Step 4(b)ii in Section4.6. Since the Verifier(s) presumably already performed the task themselves when checking the first solution, no additional CPU cycles are required to do this second check. As the Solver loses both a jackpot payout and a deposit by following this course, by assumption (ii) in Section2.3, the situation described in this paragraph should never arise.

A cash equivalence problem

Consider the following scenario. A Task Giver wishes to get a bogus solution onto the blockchain. He offers a minimal reward for a difficult task so as to ensure that no legitimate Solvers or Verifiers volunteer to inspect it. Acting as a Solver, she then provides a bogus solution, and the system accepts her solution because no one bothers to check it. It follows that for security purposes, the system must require that Task Givers compensate fairly based on the difficulty of the task. But how can TrueBit estimate the true cost of executing a given task in its native currency?

While one could potentially establish a long-term lower bound on the cost of a CPU cycle relative to a stable currency like the US dollar or Euro, calculating a lower bound relative to the value of a cryptocurrency token is another matter. Cryptocurrencies are extremely volatile. Moreover, TrueBit lives on the blockchain and does not have access to a newspaper with current exchange rates.

In the first iteration of TrueBit, we will manually update the internal cash equivalent of a CPU cycle based on a live feed (e.g. [4,23,26]). Ultimately, however, we would like to input these prices in a decentralized way without relying on a third-party. Later versions of the protocol may make use of Augur [2], a decentralized prediction market which ports outside information sources onto the Ethereum blockchain. As of this writing, Augur is currently in beta testing. Alternatively, we may build an independent blockchain for reporting prices whose “transactions” consist of exchange rate updates.

Implementation

Formally, TrueBit is a smart contract in Ethereum which uses Ethereum’s ex- isting smart contract framework to bootstrap the creation of computationally- intensive TrueBit contracts. Tasks in TrueBit take the form of C, C++, or Rust code, but the user must pass this code as input to the Google Lanai architecture [15] (discussed below) prior to submission to TrueBit. This lat- ter step guarantees consistency of simulated architecture and allows Judges to adjudicate fairly.

Google’s Lanai interpreter. The theoretical framework for the verifi- cation game requires a fixed computer architecture to be used for all veri- fication tasks. In [40], the authors used the regular Intel X86 architecture for this purpose, but we believe that this architecture is far too complicated to be used in TrueBit. All participants of the verification game, including the Judges, have to simulate the behavior of the entire architecture. Even a slight difference in these implementations could result in catastrophic loss of funds for one of the parties. Moreover, simpler architecture costs less to sim- ulate on the blockchain, and because of its simplicity and the fact that there now exists an efficient and trusted LLVM compiler writer, we have chosen to use Google’s Lanai architecture [15]. The full Google Lanai interpreter will be available as a smart contract in Ethereum so that it can be used by Judges. In all TrueBit contracts, the Lanai bytecode will be authoritative.

TrueBit’s on-chain interpreter runs in Solidity. For efficiency reasons, tasks will not be run using the interpreters except in case of dispute. In general, users will run tasks written in native programming languages and running on regular hardware. In the majority of cases where tasks do not require the Lanai interpreter, Solvers and Verifiers can privately optimize implementation of task executions and potentially gain a market advantage over other participants.

Applications

TrueBit is more than just an outsourced computation system. It is designed for use in trustless smart contracts. We present some examples of possible use cases.

Practical decentralized pooled mining

Mining rewards are extremely competitive. A typical desktop computer might only mine on average one Bitcoin block every thousand years. To reduce income variance, miners often join mining pools which share both computing resources and mining rewards. Each pool has an operator who distributes computing tasks and rewards to pool members. This central- ization degrades the security of the system by giving the operator undue influence to censor transactions [51]. In extreme cases where an operator controls more than half of the network’s hash rate, as has been the case with DwarfPool [9] in Ethereum, GHash.io [30] in Bitcoin, and could hap- pen again with Bitmain’s proposed gigantic mining center [3], the operator can even withdraw cleared transactions and double-spend money by way of a 51% attack [48].

SmartPool [27] introduces mining pools whose operators are Ethereum smart contracts. As decentralized pool operators, smart contracts have many virtues. In addition to counteracting the censorship issues described in the previous paragraph, they can operate at low cost relative to centralized pools and do not rely on a social contract to ensure fairness. Unlike other decentralized mining pools, which either have higher payout variance [24] that negate incentives for joining the pool or require a change in the proof- of-work protocol [54], SmartPool offers low variance payouts and retrofits existing cryptocurrencies, and it can handle a huge number of participants with a wide range of computing power.

SmartPool’s proposed Ethereum mining pool minimizes verification work by precomputing the 1 GB data sets needed to check Ethereum’s proof-of- work [10,51]. While this shortcut may help in checking Ethereum’s proof-of- work, not all cryptocurrencies have this lucky feature. Fortunately TrueBit contracts can check any proof-of-work, which means that with TrueBit, we can build a smart contract-based mining pool for any Nakamoto consensus- based cryptocurrency. At the time of this writing, for example, the effort required to check a Zcash proof-of-work [33] appears to exceed Ethereum’s gasLimit capacity by a factor of 50. TrueBit is an option for bringing this task within reach of Ethereum smart contracts. Project Alchemy [17],

which aims to bring smart contract functionality to Zcash in the spirit of the Dogecoin–Ethereum bridge below, may also benefit from TrueBit’s ability to check Zcash’s proof-of-work.

Dogecoin–Ethereum bridge

We can use TrueBit to build a two-way peg between Dogecoin [6] and Ether- eum, enabling Dogecoin users to move dogecoins between Dogecoin’s block- chain and Ethereum’s blockchain without transmitting currency through a third-party exchange and effectively adding smart contract functionality to Dogecoin. The Dogecoin community maintains an active interest in such a bridge [8], and current offers a prize of more than 6000 ETH for its con- struction [5,7].

TrueBit contracts can check Dogecoin’s Scrypt-based proof-of-work whereas traditional Ethereum smart contracts cannot. If Dogecoin were to enhance its scripting language with an operation indicating currency transfer to Ethereum addresses, a TrueBit contract could then confirm such transfers. The newly created dogecoin token could then be passed around Ethereum, and a final signed transaction in Ethereum could finally send the dogecoin back onto Dogecoin’s blockchain, assuming that Dogecoin miners are willing to treat the Ethereum blockchain as authoritative for such transfers.

Scalable transaction throughput

Building a system that can securely meet even the modest transaction vol- ume demands of current Bitcoin users remains a formidable challenge [41]. Proposed Byzantine “sharding” [36,42,47,49,53,55,57], and miner-based “serializing” [43,60] protocols exist which aim to distribute verification work, but here we take a simple and fundamentally different approach to decouple the two tasks which miners perform, namely

      1. selecting which transactions to include in the blockchain, and 2.verifying that blockchain transactions are valid.

Using TrueBit, one can construct a verification protocol whose incentives guarantee that task 2 is correctly performed by off-chain Solvers and Verifiers (with some help from Judges and Referees), while miners continue to perform task 1. In this way, complex transactions can securely reach the blockchain without overburdening miners.

Towards a big data system

In order to perform as a truly scalable cloud computer, TrueBit must have access to a scalable data storage system. Ethereum’s blockchain alone does not suffice as storing even moderate amounts of data directly on Ethereum’s blockchain is prohibitively expensive. TrueBit can securely access and use portions of massive data sets so long as the data is stored somewhere pub- licly and permanently, for example in Swarm [28] or on another blockchain. Parties who wish to rely on such data in a TrueBit contract must be sure that Verifiers have access to the full data set.

To use TrueBit on external data, one need only store a Merkle root of the massive data set on the blockchain and add non-deterministic steps in the verification game in which the Solver can “guess” the original data set rep- resented by the Merkle root. While Solvers and Verifiers must have access to the full data, Judges and Referees do not. Indeed, if we modify the ver- ification game so as to permit tasks for nondeterministic Turing machines, then the Solver can nondeterministically guess the certificate data as a step in the TrueBit contract. Only in cases of disputes would the Solver have to reveal external certificate data to the Judges via the blockchain. In some applications, the Solver might even even be able to reveal to the Judges a privacy-preserving zkSNARK rather than the data itself. zkSNARKs have the potential to enable privacy for many different kinds of systems on Ether- eum [59].

While in theory TrueBit’s scalable protocol can process arbitrarily com- plex tasks, in practice the verification game is inefficient and therefore secu- rity of TrueBit computations degrades as tasks reach the level of big data. For big data applications, TrueBit may not be able to rely on a one-size-fits- all verification game. Therefore we anticipate optimizing the verification game for certain classes of tasks. For example, Section3.2gives an example of an efficient, specialized verification game for matrix multiplication. In future versions of TrueBit, Task Givers might broadcast not only tasks but also an indication of the corresponding verification game to be played in case of disputes.

Remark. We conclude with a caveat: TrueBit may expose latent security vulnerabilities in the underlying Ethereum network as a result of new kinds of interactions between smart contracts and miners. By allowing smart contracts to check proof-of-works, for example, TrueBit may facilitate 38.2% attacks [62].

Acknowledgments. We thank Vitalik Buterin and Vlad Zamfir for sug- gesting the use of forced errors in the TrueBit protocol and Eli Bendersky for introducing us to Google Lanai. We also thank Loi Luu and Julia Koch for useful discussions.

Addendum

In the months following the initial release of this whitepaper, new work and feedback informed TrueBit’s development roadmap and helped refine the protocol itself. We discuss a few recent developments.

Security patches

A TrueBit adversary has either one of two goals: 1.get bogus computations onto the blockchain, or

2.extract jackpot funds without performing verification.

We investigate three attacks of the first type followed by two of the second.

Premature disclosure of random bits (Zack Lawrence). A Solver can dissuade Verifier participation by publicly broadcasting his private random bits prior to the designated reveal time. Verifiers then know immediately whether or not a forced error is in effect. Since Verifiers expect to gain little from checking solutions without forced errors, they may decide not to verify, thereby offering opportunity to get bogus solutions onto the blockchain.

1protocol’s [1] random number generator protocol, Arbit, solves this problem by instituting penalties for Solvers who prematurely reveal private random bits and rewarding users who report them. When a user correctly reports a premature reveal to TrueBit, the following occurs.

  1. The Solver’s solutions are discarded, and a new Solver lottery takes place. This re-incentivizes Verifiers to participate in the task while voiding the Solver’s incentive to reveal private information.
  2. Half of the Solver’s deposit gets burned. This makes the above attack expensive for the Solver.
  3. The other half of the Solver’s deposit goes to the party who reported the Solver’s random bits. This incentivizes Observers to report the Solver’s prematurely revealed random bits.

Incorrect secondary solution (Sina Habibian and Harley Swick [22]). Suppose that a forced error is in effect and that the Solver submits two incorrect solutions. When the Solver reveals his “correct” secondary solution in Step 4(b)ii of the protocol (Section4.6), Verifiers ignore it because there’s no chance of a jackpot payout. Indeed, the only “reward” for correctly challenging this secondary solution is to play a verification game. Hence one of the Solver’s bogus solutions ends up on the blockchain.

We eliminate the incorrect secondary solution vulnerability as follows. Denote the Solver’s two solutions by A and B. In the beginning of Step 4, rather than signaling for a challenge with the hash of an even integer, the Verifier hashes an integer whose value mod 3 the protocol interprets as follows:

  1. mod 3: challenge solution A,
  2. mod 3: challenge solution B,
  3. mod 3: challenge both A and B.

The Verifiers indicate their choice without knowing which of the two solu- tions the Solver puts forth as an answer. The protocol hides this information from Verifiers via the following process. The Solver commits to either solu- tion A or solution B by hashing either A or B paired with his private random bits, where the Solver’s private random bits serve as “noise” which prevent Verifiers from guessing which option the Solver chose. The Solver has incen- tive not to share his private random bits due to the “Premature disclosure of random bits” patch above as well as the fact that the Solver risks reducing his jackpot share by exposing this private information. Finally, once the timeout for challenges has passed, the Solver reveals his random bits in the clear, thereby indicating his commitment to either solution A or solution

        1. Challenges and forced errors then proceed as usual. In case the protocol forces the Solver to reveal his second “correct” solution, Verifiers who earlier committed to a challenge against this second solution are obligated to play a verification game. In this way, Verifiers catch errant secondary solutions just as they catch errant primary ones.

In case a forced error is not in effect, broadcasting a pair of incorrect solutions poses a cost to the Solver in the form of a lost deposit. Indeed Verifiers have proper incentives to check the Solver’s primary answer. Since forced errors occur rarely and unpredictably, the Solver expects to sacrifice several deposits in order to mount an “incorrect secondary solution” attack. This loss offsets the Solver’s eventual jackpot gain from challenging his own forced error solution. We implicitly assume that the chance to win a jackpot sufficiently motivates a Verifier to challenge whenever a Solver submits a pair

of incorrect solutions; any Verifier who challenges both submitted solutions must play a verification game.

The fix above has a potentially useful side effect of publicly indicating how many Verifiers are monitoring a given task. Indeed, a Verifier broad- casts one of the three commitment forms above if and only if she is paying attention. The option to signal “no challenge” is no longer needed for cam- ouflage because broadcasting challenges no longer indicates presence of a forced error. Moreover, if the Solver were to submit two correct solutions, the smart contract could immediately recognize them as identical and pe- nalize the Solver accordingly.

An adversary could potentially exploit the monitoring feature in the pre- vious paragraph by broadcasting multiple challenge commitments from Sybil identities, thereby reducing the total payout in case of a forced error and discouraging other rational Verifiers from participating. For this reason, the protocol must prevent Verifiers from revealing which task they are challeng- ing until the final phase of the protocol. Since each Verifier performs the given computational task without distraction from others’ commitments, an adversary cannot deter Verifier participation via Sybil attack.

Program abort (Yaron Velner). A task must explicitly specify an upper bound on the number of steps for which the verification game can run, and the Solver’s solution should return an “error” if and only if the task computation exceeds this bound. Indeed, the computation steps for a given task must form a deterministic sequence in order for Judges to determine their correctness. Ideally, one would like to run tasks directly in a native language like C, C++, or Rust, however this approach requires accurate metering of the number of computation steps. We can reduce the metering overhead by processing steps with a compiler rather than an interpreter.

Jackpot balloon attack (Cl´ement Lesaege). In the “flood of challengers” attack (Section5.3), a Cartel could artificially inflate the jackpot repository by repeatedly challenging a single forced error solution and thereby reduce jackpot payout. Eventually, the Cartel could offset the cost of this attack by cashing in on an extra large jackpot. This action could recruit membership for the Cartel at low cost by removing incentives for Verifiers who do not belong to the Cartel.

We mitigate against the benefit of this attack by placing a fixed, absolute bound on the jackpot payout, regardless of the actual jackpot repository balance. Extra revenue accumulated in the jackpot repository only becomes

available for payout after a protocol upgrade. Note that the maximum jackpot amount does not necessarily pay out at each forced error; the actual payout depends on the difficulty of the task.

Incentivizing block withholding (Cl´ement Lesaege). The attacker, who is a Solver, deploys a smart contract which pays money to miners who with- hold (or intentionally uncle) blocks whose random seed fails to yield a forced error. See [64] for possible implementations. This attack is profitable for tasks in which the expected jackpot payout times the probability of a forced error exceeds a block reward. Through repetitions, this attack could drain the jackpot repository.

Arbit’s security mechanism, discussed in “Premature disclosure of ran- dom bits” above, defends against this attack. In order to execute the present “incentivizing block withholding” attack, the Solver either has to reveal his bits publicly or has to provide them to miners through private channels. If the Solver’s penalty for a premature reveal exceeds the expected jackpot payout times the probability of a forced error, then this attack results in an expected net loss.

The TrueBit Virtual Machine

Section3.3describes a verification game which operates with respect to a Turing machine. In practice, however, no compiler exists which can trans- form C++ code into something as simple as Turing machine language. In Section6, we proposed Google Lanai as a practical compromise in the ab- sence of Turing machine architecture. Due to Lanai’s complexity, the fact that Google controls its codebase, and that progress on its development ap- pears to have slowed, we have since migrated development away from Google Lanai.

In addition to executing exactly the same computation steps regardless of hardware configuration, the complier architecture underlying the verifica- tion game, or TrueBit Virtual Machine (TVM), must satisfy the following simplicity properties.

1.A single computation step on the TVM runs comfortably within Ether- eum’s gas limit, and

2.the space required to describe a TVM state change fits inside a single Ethereum transaction.

WebAssembly architecture [31] comes sufficiently close to matching these properties so as to make TVM execution practical today. WebAssembly has

become increasingly ready-to-use due to contributions and testing by Apple, Google, Microsoft, and Mozilla [45]. Several cryptocurrency projects have begun to develop on WebAssembly, including Dfinity, eWASM, and Parity, due to the platform’s machine independence and relative simplicity.

The TVM consists of two parts:

  1. an off-chain interpreter which enumerates a list of states for a given computation, and
  2. an on-chain stepper which, given a state, can compute the next state.

Solvers and Challengers use the interpreter to create a Merklized list of states for a computation. Once the Solver and Challenger have determined the first step at which they disagree, the Judges use the stepper to run this step and rule on the outcome of the verification game.

Since Native WebAssembly does not entirely satisfy the simplicity prop- erties above, the interpreter must either further compile input code into a simpler architecture, or it must divide each WebAssembly instruction into substeps. In order to reduce the chances of inducing compiler error, the TVM follows the latter strategy. The TVM’s off-chain interpreter parses WebAssembly instructions into OCaml execution which in turn creates reg- isters describing WebAssembly suboperations. Each Ocaml-based sub-step only accesses one dynamic data structure at a time.

Additional applications

Finally, we mention a few more illustrative applications.

Video broadcasting. Livepeer [18] offers a new platform for decentral- ized, live streaming video. Users can broadcast, watch, or get paid for performing the computationally intensive process of transcoding video into different codecs and formats. TrueBit ensures that transcoders perform this work correctly, while Swarm [28] guarantees that video data remains avail- able to TrueBit during the necessary verification period.

Autonomous machine learning. McConaghy’s ArtDAO [52] generates art, sells it, and then uses its revenue to improve its own code. A TrueBit- based ArtDAO would allow a financed piece of code on the blockchain to access computational resources in such a way that no one can “pull its plug.” We may eventually see other blockchain-based machine learning ap- plications, like computer vision, as well.

Data marketplace. Hedge fund Numerai [19] crowdsources modeling problems to data scientists and then executes trades based upon their work. Numerai rewards successful models, however contributing data scientists must trust Numerai to both test their work and compensate them fairly. TrueBit enables autonomous data markets. Open Mined [21], paired with TrueBit, opens the possibility of trustless renumeration based on stream- ing flows of models and data. The Ocean Protocol [20], which facilitates contribution and sharing of data, also requires a verification mechanism.

Staking and random numbers. 1protocol [1] allows users who have ei- ther computing resources or capital, but not necessarily both, to participate as TrueBit Solvers and Verifiers by decoupling security deposits from work done. In addition, 1protocol’s Arbit protocol uses interactive verification to generate random numbers in a decentralized way.

Other applications. Please check the TrueBit website for other current ideas in progress! https://truebit.io.

References

[1]1protocol. http://1protocol.com. [2]Augur. https://www.augur.net/.

[16]I thikn the attacker is this miner—today he made over $50k. https://www.reddit.com/r/ethereum/comments/55xh2w/i_thikn_ the_attacker_is_this_miner_today_he_made/.

  1. Introducing Project Alchemy. https://z.cash/blog/ project-alchemy.html.
  2. Livepeer. https://livepeer.org/. [19]Numerai. https://numer.ai.

[20]Ocean Protocol. https://oceanprotocol.com/. [21]Open Mined. https://openmined.org/.

  1. Open problems. https://github.com/TrueBitFoundation/ Developer-Resources/wiki/Open-Problems.
  2. Oraclize. http://www.oraclize.it/. [24]P2Pool. http://p2pool.org/.

[25]RANDAO. https://github.com/randao/randao. [26]Reality Keys. https://www.realitykeys.com/. [27]SmartPool. http://smartpool.io.

[28]Swarm. http://swarm-gateways.net/. [29]The DAO. http://daohub.org/.

  1. Warning: Ghash.io is nearing 51% – leave the pool. http://www.cryptocoinsnews.com/ warning-ghash-io-nearing-51-leave-pool/.
  2. WebAssembly. http://webassembly.org/.
  3. White paper. https://github.com/ethereum/wiki/wiki/ White-Paper.
  4. Why Equihash? https://z.cash/blog/why-equihash.html. [34]Zcash. https://z.cash/.
  5. Some miners generating invalid blocks. https://bitcoin.org/en/ alert/2015-07-04-spv-mining, July 2015.
  6. Ittai Abraham, Dahlia Malkhi, Kartik Nayak, Ling Ren, and Alexander Spiegelman. Solidus: An incentive-compatible cryptocurrency based on permissionless Byzantine consensus. https://arxiv.org/abs/1612. 02916, 2016.
  7. Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, and Madars Virza. SNARKs for C: Verifying program executions succinctly and in zero knowledge. In Ran Canetti and Juan A. Garay, editors, Advances in Cryptology – CRYPTO 2013: 33rd Annual Cryptology Conference, Santa Barbara, CA, USA, August 18-22, 2013. Proceed- ings, Part II, pages 90–108, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg.
  8. Nir Bitansky, Ran Canetti, Omer Paneth, and Alon Rosen. Indistin- guishability obfuscation vs. auxiliary-input extractable functions: One must fall. https://eprint.iacr.org/2013/468.pdf.
  9. Ran Canetti, Ben Riva, and Guy N. Rothblum. Practical delegation of computation using multiple servers. In Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS ’11, pages 445–454, New York, NY, USA, 2011. ACM.
  10. Ran Canetti, Ben Riva, and Guy N. Rothblum. Refereed delegation of computation. Information and Computation, 226:16 – 36, 2013. Special Issue: Information Security as a Resource.
  11. Kyle Croman, Christian Decker, Ittay Eyal, Adem Efe Gencer, Ari Juels, Ahmed Kosba, Andrew Miller, Prateek Saxena, Elaine Shi, Emin Gu¨n Sirer, Dawn Song, and Roger Wattenhofer. On scaling decen- tralized blockchains (a position paper). In Financial Cryptography and Data Security 2016 BITCOIN Workshop, volume 9604 of Lecture Notes in Computer Science, pages 106–125. Springer Berlin Heidelberg, February 2016.
  12. Christian Decker, Jochen Seidel, and Roger Wattenhofer. Bitcoin meets strong consistency. In Proceedings of the 17th International Conference on Distributed Computing and Networking, ICDCN ’16, pages 13:1– 13:10, New York, NY, USA, 2016. ACM.
  13. Ittay Eyal, Adem Efe Gencer, Emin Gun Sirer, and Robbert Van Re- nesse. Bitcoin-NG: A scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), pages 45–59, Santa Clara, CA, March 2016. USENIX Association.
  14. Tim Goddard. AdversariallyVerifiableMachine. https:

//www.reddit.com/r/ethereum/comments/51qjz6/interactive_ verification_of_c_programs/d7ey41n/.

  1. Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. Bringing the web up to speed with WebAssembly. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, pages 185–200, New York, NY, USA, 2017. ACM.
  2. Sanjay Jain, Prateek Saxena, Frank Stephan, and Jason Teutsch. How to verify computation with a rational network. https://arxiv.org/ abs/1606.05917, June 2016.
  3. Eleftherios Kokoris Kogias, Philipp Jovanovic, Nicolas Gailly, Ismail Khoffi, Linus Gasser, and Bryan Ford. Enhancing Bitcoin security and performance with strong consistency via collective signing. In 25th USENIX Security Symposium (USENIX Security 16), pages 279–296, Austin, TX, 2016. USENIX Association.
  4. Joshua A. Kroll, Ian C. Davey, and Edward W. Felten. The economics of Bitcoin mining, or Bitcoin in the presence of adver- saries. http://www.econinfosec.org/archive/weis2013/papers/ KrollDaveyFeltenWEIS2013.pdf, June 2013.
  5. Loi Luu, Viswesh Narayanan, Chaodong Zheng, Kunal Baweja, Seth Gilbert, and Prateek Saxena. A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, pages 17–30, New York, NY, USA, 2016. ACM.
  6. Loi Luu, Jason Teutsch, Raghav Kulkarni, and Prateek Saxena. Demys- tifying incentives in the consensus computer. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS 2015), pages 706–719, New York, NY, USA, 2015. ACM.
  7. Loi Luu, Yaron Welner, Jason Teutsch, and Prateek Saxena. Smart- Pool: Practical decentralized pooled mining. http://smartpool.io/ docs/smartpool.pdf.
  8. Trent McConaghy. Wild, wooly AI DAOs. https://blog.bigchaindb. com/wild-wooly-ai-daos-d1719e040956.
  9. Silvio Micali. ALGORAND: the efficient and democratic ledger. http:

//arxiv.org/abs/1607.01341, 2016.

  1. Andrew Miller, Ahmed Kosba, Jonathan Katz, and Elaine Shi. Nonout- sourceable scratch-off puzzles to discourage Bitcoin mining coalitions. In Proceedings of the 22Nd ACM SIGSAC Conference on Computer and Communications Security, CCS ’15, pages 680–691, New York, NY, USA, 2015. ACM.
  2. Andrew Miller, Yu Xia, Kyle Croman, Elaine Shi, and Dawn Song. The honey badger of BFT protocols. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, pages 31–42, New York, NY, USA, 2016. ACM.
  3. Satoshi Nakamoto. Bitcoin P2P e-cash paper. http://www. mail-archive.com/cryptography@metzdowd.com/msg09959.html, November 2008.
  4. Rafael Pass and Elaine Shi. Hybrid consensus: Efficient consensus in the permissionless model. https://eprint.iacr.org/2016/917.pdf, 2016.
  5. Christian Reitwiessner. From smart contracts to courts with not so smart judges. https://blog.ethereum.org/2016/02/17/ smart-contracts-courts-not-smart-judges/.
  6. Christian Reitwiessner. zkSNARKs in a nutshell. https://blog. ethereum.org/2016/12/05/zksnarks-in-a-nutshell/, Dececmber 2016.
  7. Yonatan Sompolinsky, Yoad Lewenberg, and Aviv Zohar. SPECTRE: A fast and scalable cryptocurrency protocol. https://eprint.iacr. org/2016/1159.pdf, 2016.
  8. Nick Szabo. The idea of smart contracts. http://szabo.best.vwh. net/smart_contracts_idea.html, 1997.
  9. Jason Teutsch, Sanjay Jain, and Prateek Saxena. When cryptocur- rencies mine their own business. In Financial Cryptography and Data Security: 20th International Conference (FC 2016) Christ Church, Bar- bados, pages 499–514. Springer Berlin / Heidelberg, 2017.
  10. Jelle van den Hooff, M. Frans Kaashoek, and Nickolai Zeldovich. Ver- sum: Verifiable computations over large public logs. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, pages 1304–1316, New York, NY, USA, 2014. ACM.
  11. Yaron Velner, Jason Teutsch, and Loi Luu. Smart contracts make Bit- coin mining pools vulnerable. To appear in 4th Workshop on Bitcoin and Blockchain Research (BITCOIN 2017).
  12. Michael Walfish and Andrew J. Blumberg. Verifying computations without reexecuting them. Communications of the ACM, 58(2):74–84, January 2015.

Monero

Highlight to Annotate. Click to Learn.

CryptoNote v 2.0

Nicolas van Saberhagen
October 17, 2013

Introduction

“Bitcoin” [1] has been a successful implementation of the concept of p2p electronic cash. Both professionals and the general public have come to appreciate the convenient combination of public transactions and proof-of-work as a trust model. Today, the user base of electronic cash is growing at a steady pace; customers are attracted to low fees and the anonymity provided by electronic cash and merchants value its predicted and decentralized emission. Bitcoin has effectively proved that electronic cash can be as simple as paper money and as convenient as credit cards.

Unfortunately, Bitcoin suffers from several deficiencies. For example, the system’s distributed nature is inflexible, preventing the implementation of new features until almost all of the net- work users update their clients. Some critical flaws that cannot be fixed rapidly deter Bitcoin’s widespread propagation. In such inflexible models, it is more efficient to roll-out a new project rather than perpetually fix the original project.

In this paper, we study and propose solutions to the main deficiencies of Bitcoin. We believe that a system taking into account the solutions we propose will lead to a healthy competition among different electronic cash systems. We also propose our own electronic cash, “CryptoNote”, a name emphasizing the next breakthrough in electronic cash.

 

Bitcoin drawbacks and some possible solutions

Traceability of transactions

Privacy and anonymity are the most important aspects of electronic cash. Peer-to-peer payments seek to be concealed from third party’s view, a distinct difference when compared with traditional banking. In particular, T. Okamoto and K. Ohta described six criteria of ideal electronic cash, which included “privacy: relationship between the user and his purchases must be untraceable by anyone” [30]. From their description, we derived two properties which a fully anonymous electronic cash model must satisfy in order to comply with the requirements outlined by Okamoto and Ohta:

Untraceability: for each incoming transaction all possible senders are equiprobable.

Unlinkability: for any two outgoing transactions it is impossible to prove they were sent to the same person.

Unfortunately, Bitcoin does not satisfy the untraceability requirement. Since all the trans- actions that take place between the network’s participants are public, any transaction can be

unambiguously traced to a unique origin and final recipient. Even if two participants exchange funds in an indirect way, a properly engineered path-finding method will reveal the origin and final recipient.

It is also suspected that Bitcoin does not satisfy the second property. Some researchers stated ([33, 35, 29, 31]) that a careful blockchain analysis may reveal a connection between the users of the Bitcoin network and their transactions. Although a number of methods are disputed [25], it is suspected that a lot of hidden personal information can be extracted from the public database.

Bitcoin’s failure to satisfy the two properties outlined above leads us to conclude that it is not an anonymous but a pseudo-anonymous electronic cash system. Users were quick to develop solutions to circumvent this shortcoming. Two direct solutions were “laundering services” [2] and the development of distributed methods [3, 4]. Both solutions are based on the idea of mixing several public transactions and sending them through some intermediary address; which in turn suffers the drawback of requiring a trusted third party.

Recently, a more creative scheme was proposed by I. Miers et al. [28]: “Zerocoin”. Zerocoin utilizes a cryptographic one-way accumulators and zero-knoweldge proofs which permit users to “convert” bitcoins to zerocoins and spend them using anonymous proof of ownership instead of explicit public-key based digital signatures. However, such knowledge proofs have a constant but inconvenient size – about 30kb (based on today’s Bitcoin limits), which makes the proposal impractical. Authors admit that the protocol is unlikely to ever be accepted by the majority of Bitcoin users [5].

 

The proof-of-work function
Bitcoin creator Satoshi Nakamoto described the majority decision making algorithm as “one- CPU-one-vote” and used a CPU-bound pricing function (double SHA-256) for his proof-of-work scheme. Since users vote for the single history of transactions order [1], the reasonableness and consistency of this process are critical conditions for the whole system.
The security of this model suffers from two drawbacks. First, it requires 51% of the network’s mining power to be under the control of honest users. Secondly, the system’s progress (bug fixes, security fixes, etc…) require the overwhelming majority of users to support and agree to the changes (this occurs when the users update their wallet software) [6].Finally this same voting mechanism is also used for collective polls about implementation of some features [7].
This permits us to conjecture the properties that must be satisfied by the proof-of-work pricing function. Such function must not enable a network participant to have a significant advantage over another participant; it requires a parity between common hardware and high cost of custom devices. From recent examples [8], we can see that the SHA-256 function used in the Bitcoin architecture does not posses this property as mining becomes more efficient on GPUs and ASIC devices when compared to high-end CPUs.
Therefore, Bitcoin creates favourable conditions for a large gap between the voting power of participants as it violates the “one-CPU-one-vote” principle since GPU and ASIC owners posses a much larger voting power when compared with CPU owners. It is a classical example of the Pareto principle where 20% of a system’s participants control more than 80% of the votes.
One could argue that such inequality is not relevant to the network’s security since it is not the small number of participants controlling the majority of the votes but the honesty of these participants that matters. However, such argument is somewhat flawed since it is rather the possibility of cheap specialized hardware appearing rather than the participants’ honesty which poses a threat. To demonstrate this, let us take the following example. Suppose a malevolent individual gains significant mining power by creating his own mining farm through the cheap

hardware described previously. Suppose that the global hashrate decreases significantly, even for a moment, he can now use his mining power to fork the chain and double-spend. As we shall see later in this article, it is not unlikely for the previously described event to take place.

 

Irregular emission

Bitcoin has a predetermined emission rate: each solved block produces a fixed amount of coins. Approximately every four years this reward is halved. The original intention was to create a limited smooth emission with exponential decay, but in fact we have a piecewise linear emission function whose breakpoints may cause problems to the Bitcoin infrastructure.

When the breakpoint occurs, miners start to receive only half of the value of their previous reward. The absolute difference between 12.5 and 6.25 BTC (projected for the year 2020) may seem tolerable. However, when examining the 50 to 25 BTC drop that took place on November 28 2012, felt inappropriate for a significant number of members of the mining community. Figure 1 shows a dramatic decrease in the network’s hashrate in the end of November, exactly when the halving took place. This event could have been the perfect moment for the malevolent individual described in the proof-of-work function section to carry-out a double spending attack [36].

image

Fig. 1. Bitcoin hashrate chart (source: http://bitcoin.sipa.be)

 

Hardcoded constants

Bitcoin has many hard-coded limits, where some are natural elements of the original design (e.g. block frequency, maximum amount of money supply, number of confirmations) whereas other seem to be artificial constraints. It is not so much the limits, as the inability of quickly changing

them if necessary that causes the main drawbacks. Unfortunately, it is hard to predict when the constants may need to be changed and replacing them may lead to terrible consequences.

A good example of a hardcoded limit change leading to disastrous consequences is the block size limit set to 250kb1. This limit was sufficient to hold about 10000 standard transactions. In early 2013, this limit had almost been reached and an agreement was reached to increase the limit. The change was implemented in wallet version 0.8 and ended with a 24-blocks chain split and a successful double-spend attack [9]. While the bug was not in the Bitcoin protocol, but rather in the database engine it could have been easily caught by a simple stress test if there was no artificially introduced block size limit.

Constants also act as a form of centralization point. Despite the peer-to-peer nature of Bitcoin, an overwhelming majority of nodes use the official reference client [10] developed by a small group of people. This group makes the decision to implement changes to the protocol and most people accept these changes irrespective of their “correctness”. Some decisions caused heated discussions and even calls for boycott [11], which indicates that the community and the developers may disagree on some important points. It therefore seems logical to have a protocol with user-configurable and self-adjusting variables as a possible way to avoid these problems.

 

Bulky scripts

The scripting system in Bitcoin is a heavy and complex feature. It potentially allows one to create sophisticated transactions [12], but some of its features are disabled due to security concerns and some have never even been used [13]. The script (including both senders’ and receivers’ parts) for the most popular transaction in Bitcoin looks like this:

image

image

image

image

<sig> <pubKey> OP DUP OP HASH160 <pubKeyHash> OP EQUALVERIFY OP CHECKSIG.

The script is 164 bytes long whereas its only purpose is to check if the receiver possess the secret key required to verify his signature.

 

The CryptoNote Technology

Now that we have covered the limitations of the Bitcoin technology, we will concentrate on presenting the features of CryptoNote.

 

Untraceable Transactions

In this section we propose a scheme of fully anonymous transactions satisfying both untraceability and unlinkability conditions. An important feature of our solution is its autonomy: the sender is not required to cooperate with other users or a trusted third party to make his transactions; hence each participant produces a cover traffic independently.

 

Literature review

Our scheme relies on the cryptographic primitive called a group signature. First presented by

D. Chaum and E. van Heyst [19], it allows a user to sign his message on behalf of the group. After signing the message the user provides (for verification purposes) not his own single public

image

1This is so-called “soft limit” — the reference client restriction for creating new blocks. Hard maximum of

possible blocksize was 1 MB

key, but the keys of all the users of his group. A verifier is convinced that the real signer is a member of the group, but cannot exclusively identify the signer.

The original protocol required a trusted third party (called the Group Manager), and he was the only one who could trace the signer. The next version called a ring signature, introduced by Rivest et al. in [34], was an autonomous scheme without Group Manager and anonymity revocation. Various modifications of this scheme appeared later: linkable ring signature [26, 27, 17] allowed to determine if two signatures were produced by the same group member, traceable ring signature [24, 23] limited excessive anonymity by providing possibility to trace the signer of two messages with respect to the same metainformation (or “tag” in terms of [24]).

A similar cryptographic construction is also known as a ad-hoc group signature [16, 38]. It emphasizes the arbitrary group formation, whereas group/ring signature schemes rather imply a fixed set of members.

For the most part, our solution is based on the work “Traceable ring signature” by E. Fujisaki and K. Suzuki [24]. In order to distinguish the original algorithm and our modification we will call the latter a one-time ring signature, stressing the user’s capability to produce only one valid signature under his private key. We weakened the traceability property and kept the linkability only to provide one-timeness: the public key may appear in many foreign verifying sets and the private key can be used for generating a unique anonymous signature. In case of a double spend attempt these two signatures will be linked together, but revealing the signer is not necessary for our purposes.

 

Definitions

 

Elliptic curve parameters

As our base signature algorithm we chose to use the fast scheme EdDSA, which is developed and implemented by D.J. Bernstein et al. [18]. Like Bitcoin’s ECDSA it is based on the elliptic curve discrete logarithm problem, so our scheme could also be applied to Bitcoin in future.

Common parameters are:

q: a prime number; q = 2255 19;

d: an element of Fq; d = 121665/121666;

E: an elliptic curve equation; x2 + y2 = 1 + dx2y2; G: a base point; G = (x, 4/5);

l: a prime order of the base point; l = 2252 + 27742317777372353535851937790883648493;

Hs: a cryptographic hash function {0, 1}Fq;

Hp: a deterministic hash function E(Fq) E(Fq).

 

Terminology

Enhanced privacy requires a new terminology which should not be confused with Bitcoin entities.

private ec-key is a standard elliptic curve private key: a number a [1, l 1];

public ec-key is a standard elliptic curve public key: a point A = aG;

one-time keypair is a pair of private and public ec-keys;

private user key is a pair (a, b) of two different private ec-keys;

tracking key is a pair (a, B) of private and public ec-key (where B = bG and a /= b);

public user key is a pair (A, B) of two public ec-keys derived from (a, b);

standard address is a representation of a public user key given into human friendly string with error correction;

truncated address is a representation of the second half (point B) of a public user key given into human friendly string with error correction.

The transaction structure remains similar to the structure in Bitcoin: every user can choose several independent incoming payments (transactions outputs), sign them with the corresponding private keys and send them to different destinations.

Contrary to Bitcoin’s model, where a user possesses unique private and public key, in the proposed model a sender generates a one-time public key based on the recipient’s address and some random data. In this sense, an incoming transaction for the same recipient is sent to a one-time public key (not directly to a unique address) and only the recipient can recover the corresponding private part to redeem his funds (using his unique private key). The recipient can spend the funds using a ring signature, keeping his ownership and actual spending anonymous. The details of the protocol are explained in the next subsections.

 

Unlinkable payments

Classic Bitcoin addresses, once being published, become unambiguous identifier for incoming payments, linking them together and tying to the recipient’s pseudonyms. If someone wants to receive an “untied” transaction, he should convey his address to the sender by a private channel. If he wants to receive different transactions which cannot be proven to belong to the same owner he should generate all the different addresses and never publish them in his own pseudonym.

image

image

image

image

Public Private

Alice

Bob’s addr 1

Bob’s key 1

Bob

image image

Carol

Bob’s addr 2

Bob’s key 2

Fig. 2. Traditional Bitcoin keys/transactions model.

We propose a solution allowing a user to publish a single address and receive unconditional unlinkable payments. The destination of each CryptoNote output (by default) is a public key, derived from recipient’s address and sender’s random data. The main advantage against Bitcoin is that every destination key is unique by default (unless the sender uses the same data for each of his transactions to the same recipient). Hence, there is no such issue as “address reuse” by design and no observer can determine if any transactions were sent to a specific address or link two addresses together.

image

image

image

One-time key

Public Private

Alice

One-time key

Bob

image

Bob’s Address

Bob’s Key

One-time key

Carol

Fig. 3. CryptoNote keys/transactions model.

First, the sender performs a Diffie-Hellman exchange to get a shared secret from his data and half of the recipient’s address. Then he computes a one-time destination key, using the shared secret and the second half of the address. Two different ec-keys are required from the recipient for these two steps, so a standard CryptoNote address is nearly twice as large as a Bitcoin wallet address. The receiver also performs a Diffie-Hellman exchange to recover the corresponding secret key.

A standard transaction sequence goes as follows:

 

Alice wants to send a payment to Bob, who has published his standard address. She unpacks the address and gets Bob’s public key (A, B).

 

Alice generates a random r [1, l 1] and computes a one-time public key P = Hs(rA)G +

B.

image

 

Alice uses P as a destination key for the output and also packs value R = rG (as a part of the Diffie-Hellman exchange) somewhere into the transaction. Note that she can create other outputs with unique public keys: different recipients’ keys (Ai, Bi) imply different Pi even with the same r.

Transaction

Amount

Tx public key Tx output

R = rG

Sender’s ran- dom data

r

Destination key

P = Hs(rA)G + B

(A, B)

Receiver’s public key

Fig. 4. Standard transaction structure.

 

Alice sends the transaction.

 

Bob checks every passing transaction with his private key (a, b), and computes P 1 =

H

s(aR)G + B. If Alice’s transaction for with Bob as the recipient was among them, then aR = arG = rA and P 1 = P .

Transaction

H

 

Bob can recover the corresponding one-time private key: x = s(aR) + b, so as P = xG. He can spend this output at any time by signing a transaction with x.

Receiver’s private key

(a, b)

one-time private key

image

x = Hs(aR) + b

one-time public key

R

P I = Hs(aR)G + bG

P I =? P

Tx public key

Tx output

Destination key

Amount

Fig. 5. Incoming transaction check.

As a result Bob gets incoming payments, associated with one-time public keys which are

unlinkable for a spectator. Some additional notes:

When Bob “recognizes” his transactions (see step 5) he practically uses only half of his private information: (a, B). This pair, also known as the tracking key, can be passed to a third party (Carol). Bob can delegate her the processing of new transactions. Bob doesn’t need to explicitly trust Carol, because she can’t recover the one-time secret key p without Bob’s full private key (a, b). This approach is useful when Bob lacks bandwidth or computation power (smartphones, hardware wallets etc.).

In case Alice wants to prove she sent a transaction to Bob’s address she can either disclose r or use any kind of zero-knowledge protocol to prove she knows r (for example by signing the transaction with r).

H H

If Bob wants to have an audit compatible address where all incoming transaction are linkable, he can either publish his tracking key or use a truncated address. That address represent only one public ec-key B, and the remaining part required by the protocol is derived from it as follows: a = s(B) and A = s(B)G. In both cases every person is able to “recognize” all of Bob’s incoming transaction, but, of course, none can spend the funds enclosed within them without the secret key b.

 

One-time ring signatures

A protocol based on one-time ring signatures allows users to achieve unconditional unlinkability. Unfortunately, ordinary types of cryptographic signatures permit to trace transactions to their respective senders and receivers. Our solution to this deficiency lies in using a different signature type than those currently used in electronic cash systems.

We will first provide a general description of our algorithm with no explicit reference to electronic cash.

A one-time ring signature contains four algorithms: (GEN, SIG, VER, LNK): GEN: takes public parameters and outputs an ec-pair (P, x) and a public key I.

SIG: takes a message m, a set S1 of public keys {Pi}i

and a set S = S1 ∪ {Ps}.

s, a pair (Ps, xs) and outputs a signature σ

VER: takes a message m, a set S, a signature σ and outputs “true” or “false”.

LNK: takes a set I = {Ii}, a signature σ and outputs “linked” or “indep”.

The idea behind the protocol is fairly simple: a user produces a signature which can be checked by a set of public keys rather than a unique public key. The identity of the signer is indistinguishable from the other users whose public keys are in the set until the owner produces a second signature using the same keypair.

image

Private keys

x0

· · ·

xn

xi

· · ·

Ring Signature

sign

Public keys

P0

· · ·

Pi

· · ·

Pn

Fig. 6. Ring signature anonymity.

verify

H

GEN: The signer picks a random secret key x [1, l 1] and computes the corresponding public key P = xG. Additionally he computes another public key I = x p(P ) which we will call the “key image”.

SIG: The signer generates a one-time ring signature with a non-interactive zero-knowledge proof using the techniques from [21]. He selects a random subset S1 of n from the other users’ public keys Pi, his own keypair (x, P ) and key image I. Let 0 s n be signer’s secret index in S (so that his public key is Ps).

{ | } { |

He picks a random qi i = 0 . . . n and wi i = 0 . . . n, i

following transformations:

s} from (1 . . . l) and applies the

i

L = qiG, if i = s qiG + wiPi, if i /= s

i

R = qiHp(Pi), if i = s qiHp(Pi) + wiI, if i =/ s

The next step is getting the non-interactive challenge:

c = Hs(m, L1, . . . , Ln, R1, . . . , Rn)

Finally the signer computes the response:

),

ci =

wi, if i /= s

c

n

i=0

ci mod l, if i = s

c

n

i=0

ci mod l, if i = s

i

r = qi, if i /= s

qs csx mod l, if i = s

The resulting signature is σ = (I, c1, . . . , cn, r1, . . . , rn).

VER: The verifier checks the signature by applying the inverse transformations:

L1i = riG + ciPi

Ri1 = riHp(Pi) + ciI

Finally, the verifier checks if

ci = Hs(m, L0, . . . , Ln, R0, . . . , Rn) mod l

),n ?

i=0

1 1 1 1

i=0

If this equality is correct, the verifier runs the algorithm LNK. Otherwise the verifier rejects

the signature.

I

LNK: The verifier checks if I has been used in past signatures (these values are stored in the set ). Multiple uses imply that two signatures were produced under the same secret key.

The meaning of the protocol: by applying L-transformations the signer proves that he knows such x that at least one Pi = xG. To make this proof non-repeatable we introduce the key image as I = xHp(P ). The signer uses the same coefficients (ri, ci) to prove almost the same statement: he knows such x that at least one Hp(Pi) = I · x1.

If the mapping x I is an injection:

 

Nobody can recover the public key from the key image and identify the signer;

 

The signer cannot make two signatures with different I’s and the same x. A full security analysis is provided in Appendix A.

 

Standard CryptoNote transaction

By combining both methods (unlinkable public keys and untraceable ring signature) Bob achieves new level of privacy in comparison with the original Bitcoin scheme. It requires him to store only one private key (a, b) and publish (A, B) to start receiving and sending anonymous transactions. While validating each transaction Bob additionally performs only two elliptic curve multi- plications and one addition per output to check if a transaction belongs to him. For his every output Bob recovers a one-time keypair (pi, Pi) and stores it in his wallet. Any inputs can be circumstantially proved to have the same owner only if they appear in a single transaction. In

fact this relationship is much harder to establish due to the one-time ring signature.

With a ring signature Bob can effectively hide every input among somebody else’s; all possible spenders will be equiprobable, even the previous owner (Alice) has no more information than any observer.

When signing his transaction Bob specifies n foreign outputs with the same amount as his output, mixing all of them without the participation of other users. Bob himself (as well as anybody else) does not know if any of these payments have been spent: an output can be used in thousands of signatures as an ambiguity factor and never as a target of hiding. The double spend check occurs in the LNK phase when checking against the used key images set.

Bob can choose the ambiguity degree on his own: n = 1 means that the probability he has spent the output is 50% probability, n = 99 gives 1%. The size of the resulting signature increases linearly as O(n + 1), so the improved anonymity costs to Bob extra transaction fees. He also can set n = 0 and make his ring signature to consist of only one element, however this will instantly reveal him as a spender.

Transaction

Sender’s output

image

Output1

Tx input

Destination key

Output0

One-time keypair

Destination key

Outputi

. . .

Outputn

One-time private key

. . .

Destination key

Outputn

Foreign transactions

P, x

I = xHp(P )

Key image

Signatures

Ring Signature

Fig. 7. Ring signature generation in a standard transaction.

 

Egalitarian Proof-of-work

In this section we propose and ground the new proof-of-work algorithm. Our primary goal is to close the gap between CPU (majority) and GPU/FPGA/ASIC (minority) miners. It is appropriate that some users can have a certain advantage over others, but their investments should grow at least linearly with the power. More generally, producing special-purpose devices has to be as less profitable as possible.

 

Related works

The original Bitcoin proof-of-work protocol uses the CPU-intensive pricing function SHA-256. It mainly consists of basic logical operators and relies solely on the computational speed of processor, therefore is perfectly suitable for multicore/conveyer implementation.

However, modern computers are not limited by the number of operations per second alone, but also by memory size. While some processors can be substantially faster than others [8], memory sizes are less likely to vary between machines.

Memory-bound price functions were first introduced by Abadi et al and were defined as “functions whose computation time is dominated by the time spent accessing memory” [15]. The main idea is to construct an algorithm allocating a large block of data (“scratchpad”) within memory that can be accessed relatively slowly (for example, RAM) and “accessing an unpredictable sequence of locations” within it. A block should be large enough to make preserving the data more advantageous than recomputing it for each access. The algorithm also should prevent internal parallelism, hence N simultaneous threads should require N times more memory at once.

Dwork et al [22] investigated and formalized this approach leading them to suggest another variant of the pricing function: “Mbound”. One more work belongs to F. Coelho [20], who

proposed the most effective solution: “Hokkaido”.

To our knowledge the last work based on the idea of pseudo-random searches in a big array is the algorithm known as “scrypt” by C. Percival [32]. Unlike the previous functions it focuses on key derivation, and not proof-of-work systems. Despite this fact scrypt can serve our purpose: it works well as a pricing function in the partial hash conversion problem such as SHA-256 in Bitcoin.

By now scrypt has already been applied in Litecoin [14] and some other Bitcoin forks. How- ever, its implementation is not really memory-bound: the ratio “memory access time / overall time” is not large enough because each instance uses only 128 KB. This permits GPU miners to be roughly 10 times more effective and continues to leave the possibility of creating relatively cheap but highly-efficient mining devices.

image

2

·

Moreover, the scrypt construction itself allows a linear trade-off between memory size and CPU speed due to the fact that every block in the scratchpad is derived only from the previous. For example, you can store every second block and recalculate the others in a lazy way, i.e. only when it becomes necessary. The pseudo-random indexes are assumed to be uniformly distributed, hence the expected value of the additional blocks’ recalculations is 1 N , where N is the number of iterations. The overall computation time increases less than by half because there are also time independent (constant time) operations such as preparing the scratchpad and hashing on

every iteration. Saving 2/3 of the memory costs 1 · N + 1 · 2 · N = N additional recalculations;

image image

image

3 3

· · ·

9/ 1 1 1

10 results in 10 N + . . . + 10 9 N = 4.5N . It is easy to show that storing only s of all blocks

2

increases the time less than by a factor of s1 . This in turn implies that a machine with a CPU

200 times faster than the modern chips can store only 320 bytes of the scratchpad.

 

The proposed algorithm

We propose a new memory-bound algorithm for the proof-of-work pricing function. It relies on random access to a slow memory and emphasizes latency dependence. As opposed to scrypt every new block (64 bytes in length) depends on all the previous blocks. As a result a hypothetical “memory-saver” should increase his calculation speed exponentially.

Our algorithm requires about 2 Mb per instance for the following reasons:

 

It fits in the L3 cache (per core) of modern processors, which should become mainstream in a few years;

 

A megabyte of internal memory is an almost unacceptable size for a modern ASIC pipeline;

 

GPUs may run hundreds of concurrent instances, but they are limited in other ways: GDDR5 memory is slower than the CPU L3 cache and remarkable for its bandwidth, not random access speed.

 

Significant expansion of the scratchpad would require an increase in iterations, which in turn implies an overall time increase. “Heavy” calls in a trust-less p2p network may lead to serious vulnerabilities, because nodes are obliged to check every new block’s proof-of-work. If a node spends a considerable amount of time on each hash evaluation, it can be easily DDoSed by a flood of fake objects with arbitrary work data (nonce values).

 

Further advantages

 

Smooth emission

The upper bound for the overall amount of CryptoNote digital coins is: MSupply = 264 1

atomic units. This is a natural restriction based only on implementation limits, not on intuition such as “N coins ought to be enough for anybody”.

To ensure the smoothness of the emission process we use the following formula for block rewards:

BaseReward = (MSupply A) » 18,

where A is amount of previously generated coins.

 

Adjustable parameters

 

Difficulty

CryptoNote contains a targeting algorithm which changes the difficulty of every block. This decreases the system’s reaction time when the network hashrate is intensely growing or shrinking, preserving a constant block rate. The original Bitcoin method calculates the relation of actual and target time-span between the last 2016 blocks and uses it as the multiplier for the current difficulty. Obviously this is unsuitable for rapid recalculations (because of large inertia) and results in oscillations.

The general idea behind our algorithm is to sum all the work completed by the nodes and divide it by the time they have spent. The measure of work is the corresponding difficulty values in each block. But due to inaccurate and untrusted timestamps we cannot determine the exact time interval between blocks. A user can shift his timestamp into the future and the next time intervals might be improbably small or even negative. Presumably there will be few incidents of this kind, so we can just sort the timestamps and cut-off the outliers (i.e. 20%). The range of the rest values is the time which was spent for 80% of the corresponding blocks.

 

Size limits

Users pay for storing the blockchain and shall be entitled to vote for its size. Every miner deals with the trade-off between balancing the costs and profit from the fees and sets his own “soft-limit” for creating blocks. Also the core rule for the maximum block size is necessary for preventing the blockchain from being flooded with bogus transaction, however this value should not be hard-coded.

·

Let MN be the median value of the last N blocks sizes. Then the “hard-limit” for the size of accepting blocks is 2 MN . It averts the blockchain from bloating but still allows the limit to slowly grow with time if necessary.

Transaction size does not need to be limited explicitly. It is bounded by the size of a block; and if somebody wants to create a huge transaction with hundreds of inputs/outputs (or with the high ambiguity degree in ring signatures), he can do so by paying sufficient fee.

 

Excess size penalty

A miner still has the ability to stuff a block full of his own zero-fee transactions up to its maximum size 2 · Mb. Even though only the majority of miners can shift the median value, there is still a

possibility to bloat the blockchain and produce an additional load on the nodes. To discourage malevolent participants from creating large blocks we introduce a penalty function:

NewReward = BaseReward ·

BlkSize 2

(

 

\

1

MN

·

This rule is applied only when BlkSize is greater than minimal free block size which should be close to max(10kb, MN 110%). Miners are permitted to create blocks of “usual size” and even exceed it with profit when the overall fees surpass the penalty. But fees are unlikely to grow quadratically unlike the penalty value so there will be an equilibrium.

 

 

Transaction scripts

CryptoNote has a very minimalistic scripting subsystem. A sender specifies an expression Φ =

i=1

f (x1, x2, . . . , xn), where n is the number of destination public keys {Pi}n . Only five binary

operators are supported: min, max, sum, mul and cmp. When the receiver spends this payment, he produces 0 k n signatures and passes them to transaction input. The verification process simply evaluates Φ with xi = 1 to check for a valid signature for the public key Pi, and xi = 0. A verifier accepts the proof iff Φ > 0.

Despite its simplicity this approach covers every possible case:

 

≤ ≤ ≥

 

Multi-/Threshold signature. For the Bitcoin-style “M-out-of-N” multi-signature (i.e. the receiver should provide at least 0 M N valid signatures) Φ = x1+x2+. . .+xN M (for clarity we are using common algebraic notation). The weighted threshold signature (some keys can be more important than other) could be expressed as Φ = w1 · x1 + w2 · x2 + . . . + wN · xN wM . And scenario where the master-key corresponds to Φ = max(M x, x1 + x2 + . . . + xN ) M . It is easy to show that any sophisticated case can be expressed with these operators, i.e. they form basis.

Password protection. Possession of a secret password s is equivalent to the knowledge of a private key, deterministically derived from the password: k = KDF(s). Hence, a receiver can prove that he knows the password by providing another signature under the key k. The sender simply adds the corresponding public key to his own output. Note that this method is much more secure than the “transaction puzzle” used in Bitcoin [13], where the password is explicitly passed in the inputs.

Degenerate cases. Φ = 1 means that anybody can spend the money; Φ = 0 marks the output as not spendable forever.

In the case when the output script combined with public keys is too large for a sender, he can use special output type, which indicates that the recipient will put this data in his input while the sender provides only a hash of it. This approach is similar to Bitcoin’s “pay-to-hash” feature, but instead of adding new script commands we handle this case at the data structure level.

 

Conclusion

We have investigated the major flaws in Bitcoin and proposed some possible solutions. These ad- vantageous features and our ongoing development make new electronic cash system CryptoNote a serious rival to Bitcoin, outclassing all its forks.

Nobel prize laureate Friedrich Hayek in his famous work proves that the existence of con- current independent currencies has a huge positive effect. Each currency issuer (or developer in our case) is trying to attract users by improving his product. Currency is like a commodity: it can have unique benefits and shortcomings and the most convenient and trusted currency has the greatest demand. Suppose we had a currency excelling Bitcoin: it means that Bitcoin would develop faster and become better. The biggest support as an open source project would come from its own users, who are interested in it.

We do not consider CryptoNote as a full replacement to Bitcoin. On the contrary, having two (or more) strong and convenient currencies is better than having just one. Running two and more different projects in parallel is the natural flow of electronic cash economics.

Nicolas van

Digitally signed by Nicolas van Saberhagen

image

Saberhagen

DN: cn=Nicolas van Saberhagen, email=nvsaberhagen@gmail.co

m

Date: 2013.10.17 14:44:50

+02’00’

A Security

We shall give a proof for our one-time ring signature scheme. At some point it coincides with the parts of the proof in [24], but we decided to rewrite them with a reference rather than to force a reader to rush about from one paper to another.

These are the properties to be established:

i=1

 

Linkability. Given all the secret keys {xi}n for a set S it is impossible to produce n + 1

valid signatures σ1, σ2, . . . , σn+1, such that all of them pass the LNK phase (i.e. with n + 1 different key images Ii). This property implies the double spending protection in the context of CryptoNote.

 

Exculpability. Given set S, at most n 1 corresponding private keys xi (excluding i = j) and the image Ij of the keys xj it is impossible to produce a valid signature σ with Ij. This property implies theft protection in the context of CryptoNote.

 

S

Unforgeability. Given only a public keys set it is impossible to produce a valid signature

σ.

n

 

S

Anonymity. Given a signature σ and the corresponding set it is impossible to determine the secret index j of the signer with a probability p > 1 .

Linkability

Theorem 1. Our one-time ring signature scheme is linkable under the random oracle model.

Proof. Suppose an adversary can produce n + 1 valid signatures σi with key images Ii /= Ij for any i, j [1 . . . n]. Since #S = n, at least one Ii /= xiHp(Pi) for every i. Consider the corresponding signature σ = (I, c1, . . . , cn, r1, . . . , rn). VER(σ) = “true”, this means that

L1i = riG + ciPi

Ri1 = riHp(Pi) + ciI

ci = Hs(m, L1, . . . , Ln, R1, . . . , Rn) mod l

),n

1 1 1 1

i=1

The first two equalities imply

logG L1i = ri + cixi

logHp(Pi) Ri1 = ri + ci logHp(Pi) I

where logA B informally denotes the discrete logarithm of B to the base A.

image

H

As in [24] we note that i : xi = logHp(Pi) I implies that all ci’s are uniquely determined. The third equality forces the adversary to find a pre-image of s to succeed in the attack, an event whose probability is considered to be negligible.

Exculpability

Theorem 2. Our one-time ring signature scheme is exculpable under the discrete logarithm assumption in the random oracle model.

Proof. Suppose an adversary can produce a valid signature σ = (I, c1, . . . , cn, r1, . . . , rn) with I = xjHP (Pj) with given {xi | i = 1, . . . , j 1, j +1, . . . , n}. Then, we can construct an algorithm A which solves the discrete logarithm problem in E(Fq).

S

Suppose inst = (G, P ) E(Fq) is a given instance of the DLP and the goal is to get s, such that P = sG. Using the standard technique (as in [24]), A simulates the random and signing oracles and makes the adversary produce two valid signatures with Pj = P in the set : σ = (I, c1, . . . , cn, r1, . . . , rn) and σ1 = (I, c11, . . . , c1n, r11 , . . . , rn1 ).

Since I = x H (P ) in both signatures we compute x

= log

I = rj rjt

image

image

mod l

j p j

j Hp(Pj )

ctj cj

A outputs xj because Lj = rjG + cjPj = rj1 G + c1j Pj and Pj = P .

Unforgeability

It has been shown in [24] that unforgeability is just an implication of both linkability and excul- pability.

Theorem 3. If a one-time ring signature scheme is linkable and exculpable, then it is unforgeable.

Proof. Suppose an adversary can forge a signature for a given set S: σ0 = (I0, . . .). Consider all valid signatures (produced by the honest signers) for the same message m and the set S: σ1, σ2, . . . , σn. There are two possible cases:

i=1

1. I0 ∈ {Ii}n

i=1

2. I0 /∈ {Ii}n

. Which contradicts exculpability.

image

. Which contradicts linkability.

Anonymity

Theorem 4. Our one-time ring signature scheme is anonymous under the decisional Diffie- Hellman assumption in the random oracle model.

Proof. Suppose an adversary can determine the secret index j of the Signer with a probability

p = 1 + E. Then, we can construct algorithm A which solves the decisional Diffie-Hellman

image image

n 1 E

problem in E(Fq) with the probability 2 + 2 .

Let inst = (G1, G2, Q1, Q2) E(Fq) be the instance of DDH and the goal to determine if

logG1 Q1 = logG2 Q2. A feeds the adversary with valid signature σ0 = (I, . . .), where Pj =

xjG1 = Q1 and I = Q2 and simulates oracle Hp, returning G2 for query Hp(Pj).

image

image

2

2

The adversary returns k as his guess for the index i : I = xiHP (Pi). If k = j, then A returns 1 (for “yes”) otherwise a random r ∈ {1, 0}. The probability of the right choice is com- puted as in [24]: 1 +Pr (1 | inst DDH)Pr (1 | inst / DDH) = 1 +Pr (k = j | inst DDH)+

Pr (k /= j | inst DDH)·Pr (r = 1)Pr (k = j | inst / DDH)Pr (k /= j | inst / DDH)·Pr (r = 0) =

image

image

image

image

image

2

n

n

2

n

n

2

2

2

1 + 1 + E + ( n1 E) · 1 1n1 · 1 = 1 + E

image

H

In fact, the result should be reduced by the probability of collision in s, but this value is considered to be negligible.

Notes on the hash function Hp

H → H

We defined p as deterministic hash function E(Fq) E(Fq). None of the proofs demands p to be an ideal cryptographic hash function. It’s main purpose is to get a pseudo-random base for image key I = xHp(xG) in some determined way.

With fixed base (I = xG2) the following scenario is possible:

 

Alice sends two standard transactions to Bob, generating one-time tx-keys: P2 = Hs(r1A)G+

B and P1 = Hs(r2A)G + B.

 

Bob recovers corresponding one-time private tx-keys x1 and x2 and spends the outputs with valid signatures and images keys I1 = x1G2 and I2 = x2G2.

 

Now Alice can link these signatures, checking the equality I1I2 =? (Hs(r1A)−Hs(r2A))G2.

The problem is that Alice knows the linear correlation between public keys P1 and P2 and in case of fixed base G2 she also gets the same correlation between key images I1 and I2. Replacing G2 with Hp(xG2), which does not preserve linearity, fixes that flaw.

For constructing deterministic Hp we use algorithm presented in [37].

References

 

http://bitcoin.org.

image

 

https://en.bitcoin.it/wiki/Category:Mixing Services.

 

http://blog.ezyang.com/2012/07/secure-multiparty-bitcoin-anonymization.

 

https://bitcointalk.org/index.php?topic=279249.0.

 

http://msrvideo.vo.msecnd.net/rmcvideos/192058/dl/192058.pdf.

 

https://github.com/bitcoin/bips/blob/master/bip-0034.mediawiki#Specification.

image

 

https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki#Backwards Compatibility.

image

image

 

https://en.bitcoin.it/wiki/Mining hardware comparison.

 

https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki.

 

http://luke.dashjr.org/programs/bitcoin/files/charts/branches.html.

 

https://bitcointalk.org/index.php?topic=196259.0.

 

https://en.bitcoin.it/wiki/Contracts.

 

https://en.bitcoin.it/wiki/Script.

 

http://litecoin.org.

 

Mart´ın Abadi, Michael Burrows, and Ted Wobber. Moderately hard, memory-bound func- tions. In NDSS, 2003.

 

Ben Adida, Susan Hohenberger, and Ronald L. Rivest. Ad-hoc-group signatures from hi- jacked keypairs. In in DIMACS Workshop on Theft in E-Commerce, 2005.

 

Man Ho Au, Sherman S. M. Chow, Willy Susilo, and Patrick P. Tsang. Short linkable ring signatures revisited. In EuroPKI, pages 101–115, 2006.

 

Daniel J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and Bo-Yin Yang. High-speed high-security signatures. J. Cryptographic Engineering, 2(2):77–89, 2012.

 

David Chaum and Eug`ene van Heyst. Group signatures. In EUROCRYPT, pages 257–265, 1991.

 

Fabien Coelho. Exponential memory-bound functions for proof of work protocols. IACR Cryptology ePrint Archive, 2005:356, 2005.

 

Ronald Cramer, Ivan Damg˚ard, and Berry Schoenmakers. Proofs of partial knowledge and simplified design of witness hiding protocols. In CRYPTO, pages 174–187, 1994.

 

Cynthia Dwork, Andrew Goldberg, and Moni Naor. On memory-bound functions for fighting spam. In CRYPTO, pages 426–444, 2003.

 

Eiichiro Fujisaki. Sub-linear size traceable ring signatures without random oracles. In CT- RSA, pages 393–415, 2011.

 

Eiichiro Fujisaki and Koutarou Suzuki. Traceable ring signature. In Public Key Cryptogra- phy, pages 181–200, 2007.

 

Jezz Garzik. Peer review of “quantitative analysis of the full bitcoin transaction graph”. https://gist.github.com/3901921, 2012.

 

Joseph K. Liu, Victor K. Wei, and Duncan S. Wong. Linkable spontaneous anonymous group signature for ad hoc groups (extended abstract). In ACISP, pages 325–335, 2004.

 

Joseph K. Liu and Duncan S. Wong. Linkable ring signatures: Security models and new schemes. In ICCSA (2), pages 614–623, 2005.

 

Ian Miers, Christina Garman, Matthew Green, and Aviel D. Rubin. Zerocoin: Anonymous distributed e-cash from bitcoin. In IEEE Symposium on Security and Privacy, pages 397– 411, 2013.

 

Micha Ober, Stefan Katzenbeisser, and Kay Hamacher. Structure and anonymity of the bitcoin transaction graph. Future internet, 5(2):237–250, 2013.

 

Tatsuaki Okamoto and Kazuo Ohta. Universal electronic cash. In CRYPTO, pages 324–337, 1991.

 

Marc Santamaria Ortega. The bitcoin transaction graph — anonymity. Master’s thesis, Universitat Oberta de Catalunya, June 2013.

 

Colin Percival. Stronger key derivation via sequential memory-hard functions. Presented at BSDCan’09, May 2009.

 

Fergal Reid and Martin Harrigan. An analysis of anonymity in the bitcoin system. CoRR, abs/1107.4524, 2011.

 

Ronald L. Rivest, Adi Shamir, and Yael Tauman. How to leak a secret. In ASIACRYPT, pages 552–565, 2001.

 

Dorit Ron and Adi Shamir. Quantitative analysis of the full bitcoin transaction graph.

IACR Cryptology ePrint Archive, 2012:584, 2012.

 

Meni Rosenfeld. Analysis of hashrate-based double-spending. 2012.

 

Maciej Ulas. Rational points on certain hyperelliptic curves over finite fields. Bulletin of the Polish Academy of Sciences. Mathematics, 55(2):97–104, 2007.

 

Qianhong Wu, Willy Susilo, Yi Mu, and Fangguo Zhang. Ad hoc group signatures. In

IWSEC, pages 120–135, 2006.

 

 

Zcash

Highlight to Annotate. Click to Learn.

 

Zerocash: Decentralized Anonymous Payments from Bitcoin

(extended version)

Eli Ben-Sasson Alessandro Chiesa Christina Garman Matthew Green Ian Miers Eran Tromer§ Madars Virza

May 18, 2014

Abstract

Bitcoin is the first digital currency to see widespread adoption. Although payments are conducted between pseudonyms, Bitcoin cannot offer strong privacy guarantees: payment transactions are recorded in a public decentralized ledger, from which much information can be deduced. Zerocoin (Miers et al., IEEE S&P 2013) tackles some of these privacy issues by unlinking transactions from the payment’s origin. Yet it still reveals payment destinations and amounts, and is limited in functionality.

In this paper, we construct a full-fledged ledger-based digital currency with strong privacy guarantees. Our results leverage recent advances in zero-knowledge Succinct Non-interactive ARguments of Knowledge (zk-SNARKs).

We formulate and construct decentralized anonymous payment schemes (DAP schemes). A DAP scheme lets users pay each other directly and privately: the corresponding transaction hides the payment’s origin, destination, and amount. We provide formal definitions and proofs of the construction’s security.

We then build Zerocash, a practical instantiation of our DAP scheme construction. In Zerocash, transactions are less than 1 kB and take under 6 ms to verify — orders of magnitude more efficient than the less-anonymous Zerocoin and competitive with plain Bitcoin.

Keywords: Bitcoin, decentralized electronic cash, zero-knowledge proofs

∗Technion, eli@cs.technion.ac.il

†MIT, {alexch, madars}@mit.edu

‡Johns Hopkins University, {cgarman, imiers, mgreen}@cs.jhu.edu

Tel Aviv University, tromer@cs.tau.ac.il

§

Contents

  1. Introduction 3
    1. zk-SNARKs 4
    2. Centralized anonymous payment systems 5
    3. Decentralized anonymous payment schemes 5
    4. Zerocash 9
    5. Paper organization 10
  2. Background on zk-SNARKs 10
    1. Informal definition 10
    2. Comparison with NIZKs 11
    3. Known constructions and security 12
    4. zk-SNARK implementations 12
  3. Definition of a decentralized anonymous payment scheme 13
    1. Data structures 13
    2. Algorithms 14
    3. Completeness 16
    4. Security 16
  4. Construction of a decentralized anonymous payment scheme 18
    1. Cryptographic building blocks 18
    2. zk-SNARKs for pouring coins 19
    3. Algorithm constructions 20
    4. Completeness and security 20
  5. Zerocash 20
    1. Instantiation of building blocks 22
    2. Arithmetic circuit for pouring coins 23
  6. Integration with existing ledger-based currencies 26
    1. Integration by replacing the base currency 26
    2. Integration by hybrid currency 26
    3. Extending the Bitcoin protocol to support the combined semantics 28
    4. Additional anonymity considerations 28
  7. Experiments 28
    1. Performance of zk-SNARKs for pouring coins 29
    2. Performance of Zerocash algorithms 29
    3. Large-scale network simulation 30
  8. Optimizations and extensions 33
    1. Everlasting anonymity 33
    2. Fast block propagation 34
    3. Improved storage requirements 34
  9. Concurrent work 36
  10. Conclusion 36

Acknowledgments 37

  1. Overview of Bitcoin and Zerocoin 38
    1. Bitcoin 38
    2. Zerocoin 38
  2. Completeness of DAP schemes 39
  3. Security of DAP schemes 40
    1. Ledger indistinguishability 41
    2. Transaction non-malleability 42
    3. Balance 43
  4. Proof of Theorem 4.1 44
    1. Proof of ledger indistinguishability 44
    2. Proof of transaction non-malleability 48
    3. Proof of balance 51

References 54

1 Introduction

Bitcoin is the first digital currency to achieve widespread adoption. The currency owes its rise in part to the fact that, unlike traditional e-cash schemes [Cha82, CHL05, ST99], it requires no trusted parties. Instead of appointing a central bank, Bitcoin uses a distributed ledger known as the block chain to store transactions carried out between users. Because the block chain is massively replicated by mutually-distrustful peers, the information it contains is public.

While users may employ many identities (or pseudonyms) to enhance their privacy, an increasing body of research shows that anyone can de-anonymize Bitcoin by using information in the block chain [RM11, BBSU12, RS12, MPJ+13], such as the structure of the transaction graph as well as the value and dates of transactions. As a result, Bitcoin fails to offer even a modicum of the privacy provided by traditional payment systems, let alone the robust privacy of anonymous e-cash schemes. While Bitcoin is not anonymous itself, those with sufficient motivation can obfuscate their transaction history with the help of mixes (also known as laundries or tumblers). A mix allows users to entrust a set of coins to a pool operated by a central party and then, after some interval, retrieve different coins (with the same total value) from the pool. However, mixes suffer from three limitations: (i) the delay to reclaim coins must be large to allow enough coins to be mixed in; (ii) the mix operator can trace coins; and (iii) the mix operator may steal coins.1 For users with “something to hide”, these risks may be acceptable. But typical legitimate users (1) wish to keep their spending habits private from their peers, (2) are risk-averse and do not wish to expend continual effort in protecting their privacy, and (3) are often not sufficiently aware that their privacy

has been compromised.

To protect their privacy, users thus need an instant, risk-free, and, most importantly, automatic guarantee that data revealing their spending habits and account balances is not publicly accessible by their neighbors, co-workers, and the merchants with whom they do business. Anonymous transactions also ensure that the market value of a coin is independent of its history, thus ensuring that legitimate users’ coins remain fungible.2

Zerocoin: a decentralized mix. Miers et al. [MGGR13] proposed Zerocoin, which extends Bitcoin to provide strong anonymity guarantees. Like many e-cash protocols (e.g., [CHL05]), Zerocoin employs zero-knowledge proofs to prevent transaction graph analyses. Unlike earlier practical e-cash protocols, however, Zerocoin does not rely on digital signatures to validate coins, nor does it require a central bank to prevent double spending. Instead, Zerocoin authenticates coins by proving, in zero-knowledge, that they belong to a public list of valid coins (which can be maintained on the block chain). Yet rather than a full-fledged anonymous currency, Zerocoin is a decentralized mix, where users may periodically “wash” their bitcoins via the Zerocoin protocol. Routine day-to-day transactions must be conducted via Bitcoin, due to reasons that we now review.

The first reason is performance. Redeeming zerocoins requires double-discrete-logarithm proofs of knowledge, which have size that exceeds 45 kB and require 450 ms to verify (at the 128-bit security level).3 These proofs must be broadcast through the network, verified by every node, and permanently stored in the ledger. The entailed costs are higher, by orders of magnitude, than those in Bitcoin and can seriously tax a Bitcoin network operating at normal scale.

1CoinJoin [Max13], an alternative proposal, replaces the central party of a mix with multi-signature transactions that involve many collaborating Bitcoin users. CoinJoin can thus only mix small volumes of coins amongst users who are currently online, is prone to denial-of-service attacks by third parties, and requires effort to find mixing partners. 2While the methods we detail in this paper accomplish this, the same techniques open the door for privacy-preserving

accountability and oversight (see Section 10).

3These published numbers [MGGR13] actually use a mix of parameters at both 128-bit and 80-bit security for different components of the construction. The cost is higher if all parameters are instantiated at 128-bit security.

The second reason is functionality. While Zerocoin constitutes a basic e-cash scheme, it lacks critical features required of full-fledged anonymous payments. First, Zerocoin uses coins of fixed denomination: it does not support payments of exact values, nor does it provide a means to give change following a transaction (i.e., divide coins). Second, Zerocoin has no mechanism for one user to pay another one directly in “zerocoins”. And third, while Zerocoin provides anonymity by unlinking a payment transaction from its origin address, it does not hide the amount or other metadata about transactions occurring on the network.

Our contribution. Addressing this challenge, this work offers two main contributions.

  1. We introduce the notion of a decentralized anonymous payment scheme, which formally captures the functionality and security guarantees of a full-fledged decentralized electronic currency with strong anonymity guarantees. We provide a construction of this primitive and prove its security under specific cryptographic assumptions. The construction leverages recent advances in the area of zero-knowledge proofs. Specifically, it uses zero-knowledge Succinct Non-interactive ARguments of Knowledge (zk-SNARKs) [Gro10, Lip12, BCI+13, GGPR13, PGHR13, BCG+13, Lip13, BCTV14].
  2. We implement the above primitive, via a system that we call Zerocash. Our system (at 128 bits of security):
  • reduces the size of transactions spending a coin to under 1 kB (an improvement of over 97.7%);
  • reduces the spend-transaction verification time to under 6 ms (an improvement of over 98.6%);
  • allows for anonymous transactions of variable amounts;
  • hides transaction amounts and the values of coins held by users; and
  • allows for payments to be made directly to a user’s fixed address (without user interaction).

To validate our system, we measured its performance and established feasibility by conducting

experiments in a test network of 1000 nodes (approximately 1

16

of the unique IPs in the Bitcoin

network and 1

3

of the nodes reachable at any given time [DW13]). This inspires confidence that

Zerocash can be deployed as a fork of Bitcoin and operate at the same scale. Thus, due to its substantially improved functionality and performance, Zerocash makes it possible to entirely replace traditional Bitcoin payments with anonymous alternatives.

Concurrent work. The idea of using zk-SNARKs in the Bitcoin setting was first presented by one of the authors at Bitcoin 2013 [Ben13]. In concurrent work, Danezis et al. [DFKP13] suggest using zk-SNARKs to reduce proof size and verification time in Zerocoin; see Section 9 for a comparison.

zk-SNARKs

A zk-SNARK is an efficient variant of a zero-knowledge proof of knowledge [GMR89], which we first informally describe via an example. Suppose Alice wishes to prove to Bob the statement “I (Alice) own 30 bitcoins”. A simple method for Alice to do so is to point to 30 coins on the block chain and, for each of them, sign a message (“hello, world”) using the secret key that controls that coin. Alas, this method leaks knowledge to Bob, by identifying which coins are Alice’s. A zero-knowledge proof of knowledge allows Alice to achieve the same goal, while revealing no information to Bob (beyond the fact that she knows some secret keys that control 30 coins). Crucially, such proofs can be obtained for any statement that can be verified to be true by use of an efficient computation involving auxiliary inputs such as trapdoors and passwords (such statements are called “NP statements”).

We now sketch in more technical terms the definition of a zk-SNARK; see Section 2 for more details. A zk-SNARK is a non-interactive zero-knowledge proof of knowledge that is succinct, i.e., for which proofs are very short and easy to verify. More precisely, let L be an NP language, and let C be a nondeterministic decision circuit for L on a given instance size n. A zk-SNARK can be used

to prove and verify membership in L, for instances of size n, as follows. After taking C as input, a trusted party conducts a one-time setup phase that results in two public keys: a proving key pk and a verification key vk. The proving key pk enables any (untrusted) prover to produce a proof π attesting to the fact that x ∈ L, for an instance x (of size n) of his choice. The non-interactive proof π is zero knowledge and a proof of knowledge. Anyone can use the verification key vk to verify the proof π; in particular zk-SNARK proofs are publicly verifiable: anyone can verify π, without ever having to interact with the prover who generated π. Succinctness requires that (for a given security level) π has constant size and can be verified in time that is linear in |x| (rather than linear in |C|).

Centralized anonymous payment systems

Before describing our new decentralized payment system, we put it in context by recalling two pre-Bitcoin payment schemes, both of which relied on a bank, acting as a central trusted party.

Anonymous e-cash. Chaum [Cha82] first obtained anonymous e-cash. In Chaum’s scheme, the minting of a coin involves both a user, Alice, and the bank: to mint a coin of a given value v, Alice first selects a random secret serial number sn (unknown to the bank); then, the bank, after deducting v from Alice’s balance, signs sn via a blind signature. Afterwards, if Alice wants to transfer her coin to Bob, she reveals sn to him and proves that sn was signed by the bank; during this transfer, Bob (or the bank) cannot deduce Alice’s identity from the revealed information. Double-spending is prevented because the bank will not honor a coin with a previously-seen serial number.

Unforgeable e-cash. One problem with Chaum’s scheme is that coins can be forged if the bank’s secret key is compromised. Sander and Ta-Shma [ST99] addressed this, as follows. The bank maintains a public Merkle tree of “coin commitments”, and users periodically retrieve its root rt; in particular, the bank maintains no secrets. When Alice requests a coin (of unit value), she picks a random serial number sn and auxiliary string r, and then sends cm := CRH(sn”r) to the bank, where CRH is a collision-resistant hash; the bank deducts the appropriate amount from Alice’s balance and then records cm as a leaf in the Merkle tree. Afterwards, to pay Bob, Alice sends him sn along with a zero-knowledge proof of knowledge π of the following NP statement: “there exists r such that CRH(sn”r) is a leaf in a Merkle tree with root rt. In other words, Alice can convince Bob that sn is the serial number contained in some coin commitment in the Merkle tree; but the zero-knowledge property prevents Bob from learning information about which coin commitment is Alice’s, thereby protecting Alice’s identity. Later, Bob can “cash out” Alice’s coin by showing sn and π to the bank.4

Moving to a fungible anonymous decentralized system. In this paper, like [ST99], we hash a coin’s serial number and use Merkle trees to compactly represent the set of minted coins. Unlike [ST99], we also ensure the privacy of a coin’s value and support transactions that split and merge coins, thus achieving (and implementing) a new kind of fully-fungible and divisible payment scheme. As in Bitcoin (and in stark contrast to previous e-cash schemes), we do not rely on a trusted bank. Therefore, we require a new set of definitions and protocols, designed to protect Alice’s anonymity while preventing her from falsely increasing her balance under the veil of her boosted privacy. An informal description of our payment scheme follows.

Decentralized anonymous payment schemes

We construct a decentralized anonymous payment (DAP) scheme, which is a decentralized e-cash scheme that allows direct anonymous payments of any amount. See Section 3 for a formal definition.

4We omit details about how the bank can identify Alice in the event that she double spends her coin.

Here, we outline our construction in six incremental steps; the construction details are in Section 4. Our construction functions on top of any ledger-based base currency, such as Bitcoin. At any given time, a unique valid snapshot of the currency’s ledger is available to all users. The ledger is a sequence of transactions and is append-only. Transactions include both the underlying currency’s transactions, as well as new transactions introduced by our construction. For concreteness, we focus the discussion below on Bitcoin (though later definitions and constructions are stated abstractly). We assume familiarity with Bitcoin [Nak09] and Zerocoin [MGGR13]; both are reviewed in Appendix A.

Step 1: user anonymity with ftxed-value coins. We first describe a simplified construction, in which all coins have the same value of, e.g., 1 BTC. This construction, similar to the Zerocoin protocol, shows how to hide a payment’s origin. In terms of tools, we make use of zk-SNARKs (recalled above) and a commitment scheme. Let COMM denote a statistically-hiding non-interactive commitment scheme (i.e., given randomness r and message m, the commitment is c := COMMr(m); subsequently, c is opened by revealing r and m, and one can verify that COMMr(m) equals c).

In the simplified construction, a new coin c is minted as follows: a user u samples a random serial number sn and a trapdoor r, computes a coin commitment cm := COMMr(sn), and sets c := (r, sn, cm). A corresponding mint transaction txMint, containing cm (but not sn or r), is sent to the ledger; txMint is appended to the ledger only if u has paid 1 BTC to a backing escrow pool (e.g., the 1 BTC may be paid via plaintext information encoded in txMint). Mint transactions are thus certificates of deposit, deriving their value from the backing pool.

Subsequently, letting CMList denote the list of all coin commitments on the ledger, u may spend c by posting a spend transaction txSpend that contains (i) the coin’s serial number sn; and (ii) a zk-SNARK proof π of the NP statement “I know r such that COMMr(sn) appears in the list CMList of coin commitments”. Assuming that sn does not already appear on the ledger (as part of a past spend transaction), u can redeem the deposited amount of 1 BTC, which u can either keep, transfer to someone else, or mint a new coin. (If sn does already appear on the ledger, this is considered double spending, and the transaction is discarded.)

User anonymity is achieved because the proof π is zero-knowledge: while sn is revealed, no information about r is, and finding which of the numerous commitments in CMList corresponds to a particular spend transaction txSpend is equivalent to inverting f (x) := COMMx(sn), which is assumed to be infeasible. Thus, the origin of the payment is anonymous.

Step 2: compressing the list of coin commitments. In the above NP statement, CMList is specified explicitly as a list of coin commitments. This naive representation severely limits scalability because the time and space complexity of most protocol algorithms (e.g., the proof verification algorithm) grow linearly with CMList. Moreover, coin commitments corresponding to already-spent coins cannot be dropped from CMList to reduce costs, since they cannot be identified (due to the same zero-knowledge property that provides anonymity).

As in [ST99], we rely on a collision-resistant function CRH to avoid an explicit representation of CMList. We maintain an efficiently-updatable append-only CRH-based Merkle tree Tree(CMList) over the (growing) list CMList and let rt denote the root of Tree(CMList). It is well-known that rt can be updated to account for the insertion of new leaves with time and space proportional to just the tree depth. Hence, the time and space complexity is reduced from linear in the size of CMList to logarithmic. With this in mind, we modify the NP statement to the following one: “I know r such that COMMr(sn) appears as a leaf in a CRH-based Merkle tree whose root is rt. Compared with the naive data structure for CMList, this modification increases exponentially the size of CMList that a given zk-SNARK implementation can support. (Concretely: using Merkle trees of depth 64, Zerocash supports 264 coins.)

Step 3: extending coins for direct anonymous payments. So far, the coin commitment

cm of a coin c is a commitment to the coin’s serial number sn. However, this creates a problem when transferring c to another user. Indeed, suppose that a user uA created c, and uA sends c to another user uB. First, since uA knows sn, the spending of c by uB is both non-anonymous (since uA sees when c is spent, by recognizing sn) and risky (since uA could still spend c first). Thus, uB must immediately spend c and mint a new coin cj to protect himself. Second, if uA in fact wants to transfer to uB, e.g., 100 BTC, then doing so is both unwieldy (since it requires 100 transfers) and non-anonymous (since the amount of the transfer is leaked). And third, transfers in amounts that are not multiples of 1 BTC (the fixed value of a coin) are not supported. Thus, the simplified construction described is inadequate as a payment scheme.

We address this by modifying the derivation of a coin commitment, and using pseudorandom functions to target payments and to derive serial numbers, as follows. We use three pseudorandom

functions (derived from a single one). For a seed x, these are denoted PRFaddr(·), PRFsn(·), and

x x

PRFpk(·). We assume that PRFsn is moreover collision-resistant.

x

To provide targets for payments, we use addresses: each user u generates an address key pair (apk, ask), the address public key and address private key respectively. The coins of u contain the value apk and can be spent only with knowledge of ask. A key pair (apk, ask) is sampled by selecting a random seed ask and setting apk := PRFaddr(0). A user can generate and use any number of

ask

address key pairs.

Next, we redesign minting to allow for greater functionality. To mint a coin c of a desired value v, the user u first samples ρ, which is a secret value that determines the coin’s serial number as sn := PRFsn (ρ). Then, u commits to the tuple (apk, v, ρ) in two phases: (a) u computes k := COMMr(apkρ) for a random r; and then (b) u computes cm := COMMs(vk) for a random s. The minting results in a coin c := (apk, v, ρ, r, s, cm) and a mint transaction txMint := (v, k, s, cm). Crucially, due to the nested commitment, anyone can verify that cm in txMint is a coin commitment of a coin of value v (by checking that COMMs(vk) equals cm) but cannot discern the owner (by learning the address key apk) or serial number (derived from ρ) because these are hidden in k. As before, txMint is accepted by the ledger only if u deposits the correct amount, in this case v BTC.

ask

Coins are spent using the pour operation, which takes a set of input coins, to be consumed, and “pours” their value into a set of fresh output coins — such that the total value of output coins equals

the total value of the input coins. Suppose that u, with address key pair (aold, aold), wishes to consume

pk sk

his coin cold = (aold, vold, ρold, rold, sold, cmold) and produce two new coins cnew and cnew, with total

pk 1 2

value vnew + vnew = vold, respectively targeted at address public keys anew and anew . (The addresses

1 2 pk,1 pk,2

and a

new pk,1

a

new pk,2

may belong to u or to some other user.) The user u, for each i ∈ {1, 2}, proceeds as

follows: (i) u samples serial number randomness ρnew; (ii) u computes knew := COMMrnew (anewρnew)

i

i

i

pk,i

i

for a random rnew; and (iii) u computes cmnew := COMMsnew (vnewknew) for a random snew.

i

i

i

i

i

i

This yields the coins cnew := (anew , vnew, ρnew, rnew, snew, cmnew) and cnew := (anew , vnew, ρnew,

1 pk,1 1

1 1 1 1

2 pk,2 2 2

rnew, snew, cmnew). Next, u produces a zk-SNARK proof πPOUR for the following NP statement, which

2 2 2

we call POUR:

“Given the Merkle-tree root rt, serial number snold, and coin commitments cmnew, cmnew, I

1 2

know coins cold, cnew, cnew, and address secret key aold such that:

1 2 sk

      • The coins are well-formed: for cold it holds that kold = COMMrold (aoldρold) and cmold =

pk

COMMsold (voldkold); and similarly for cnew and cnew.

1

2

      • The address secret key matches the public key: aold = PRFaddr(0).

a

pk

      • The serial number is computed correctly: snold := PRFsn

sk

aold

old sk

(ρold).

      • The coin commitment cmold appears as a leaf of a Merkle-tree with root rt.
      • The values add up: vnew + vnew = vold.”

1

2

A resulting pour transaction txPour := (rt, snold, cmnew, cmnew, πPOUR) is appended to the ledger.

1 2

(As before, the transaction is rejected if the serial number sn appears in a previous transaction.)

Now suppose that u does not know, say, the address secret key anew that is associated with the public key anew . Then, u cannot spend cnew because he cannot provide anew as part of the witness

sk,1

pk,1 1 sk,1

of a subsequent pour operation. Furthermore, when a user who knows anew does spend cnew, the

sk,1 1

user u cannot track it, because he knows no information about its revealed serial number, which is

new sn new

sn := PRF new (ρ ).

1 a 1

sk,1

Also observe that txPour reveals no information about how the value of the consumed coin was

divided among the two new fresh coins, nor which coin commitment corresponds to the consumed coin, nor the address public keys to which the two new fresh coins are targeted. The payment was conducted in full anonymity.

More generally, a user may pour N old 0 coins into N new 0 coins. For simplicity we consider the case N old = N new = 2, without loss of generality. Indeed, for N old < 2, the user can mint a coin with value 0 and then provide it as a “null” input, and for N new < 2, the user can create (and discard) a new coin with value 0. For N old > 2 or N new > 2, the user can compose log N old + log N new of the 2-input/2-output pours.

≥ ≥

Step 4: sending coins. Suppose that anew is the address public key of u1. In order to allow u1

pk,1

to actually spend the new coin cnew produced above, u must somehow send the secret values in cnew to u1. One way is for u to send u1 a private message, but the requisite private communication channel necessitates additional infrastructure or assumptions. We avoid this “out-of-band” channel and instead build this capability directly into our construction by leveraging the ledger as follows. We modify the structure of an address key pair. Each user now has a key pair (addrpk, addrsk), where addrpk = (apk, pkenc) and addrsk = (ask, skenc). The values (apk, ask) are generated as before.

1

1

In addition, (pkenc, skenc) is a key pair for a key-private encryption scheme [BBDP01].

Then, u computes the ciphertext C1 that is the encryption of the plaintext (vnew, ρnew, rnew, snew),

1 1 1 1

under pknew (which is part of u1’s address public key addrnew), and includes C1 in the pour

enc,1

sk,1

transaction txPour. The user u1 can then find and decrypt this message (using his sknew

enc,1

) by

scanning the pour transactions on the public ledger. Again, note that adding C1 to txPour leaks neither paid amounts, nor target addresses due to the key-private property of the encryption scheme. (The user u does the same with cnew and includes a corresponding ciphertext C2 in txPour.)

2

Step 5: public outputs. The construction so far allows users to mint, merge, and split coins. But how can a user redeem one of his coins, i.e., convert it back to the base currency (Bitcoin)? For this, we modify the pour operation to include a public output. When spending a coin, the user u also specifies a nonnegative vpub and a transaction string info 0, 1 . The balance equation in the NP statement POUR is changed accordingly: “vnew + vnew + vpub = vold”. Thus, of the input

∈ { }

1 2

value vold, a part vpub is publicly declared, and its target is specified, somehow, by the string info. The string info can be used to specify the destination of these redeemed funds (e.g., a Bitcoin wallet public key).5 Both vpub and info are now included in the resulting pour transaction txPour. (The public output is optional, as the user u can set vpub = 0.)

Step 6: non-malleability. To prevent malleability attacks on a pour transaction txPour (e.g., embezzlement by re-targeting the public output of the pour by modifying info), we further modify the NP statement POUR and use digital signatures. Specifically, during the pour operation, the user u

(i) samples a key pair (pksig, sksig) for a one-time signature scheme; (ii) computes hSig := CRH(pksig);

  1. computes the two values h1 := PRFpk

aold

sk,1

(hSig) and h2 := PRFpk

sk,2

aold

(hSig), which act as MACs to

5These public outputs can be considered as an “input” to a Bitcoin-style transaction, where the string info contains the Bitcoin output scripts. This mechanism also allows us to support Bitcoin’s public transaction fees.

“tie” hSig to both address secret keys; (iv) modifies POUR to include the three values hSig, h1, h2 and prove that the latter two are computed correctly; and (v) uses sksig to sign every value associated with the pour operation, thus obtaining a signature σ, which is included, along with pksig, in txPour.

Since the aold are secret, and with high probability hSig changes for each pour transaction, the

sk,i

values h1, h2 are unpredictable. Moreover, the signature on the NP statement (and other values) binds all of these together, as argued in more detail in Appendix C and Appendix D.

This ends the outline of the construction, which is summarized in part in Figure 1. We conclude by noting that, due to the zk-SNARK, our construction requires a one-time trusted setup of public parameters. The soundness of the proofs depends on this trust, though anonymity continues to hold even if the setup is corrupted by a malicious party.

    1. Merke tree over (cm1,cm2,…)

rt

    1. coin

c = ((a

,pk

), v, ρ, r, s, cm)

rt = Merkle-tree root

cm = coin commitment

pk enc

CRH

CRH

CRH

sn = serial number

    1. coin commitment

cm

COMM

s

PRFsn

asfi

v

COMM

r

ρ

 

CRH

    1. serial number

sn

v = coin value

r, s = commitment rand.

ρ = serial number rand. (apk,pkenc) = address public key (ask,skenc) = address secret key

 

CRH

CRH

 

CRH

CRH

CRH

CRH

 

cm1 cm2 cm3 cm4 cm5 cm6 cm7 cm8 …

apfi 0

Figure 1: (a) Illustration of the CRH-based Merkle tree over the list CMList of coin commitments. (b) A coin c. (c) Illustration of the structure of a coin commitment cm. (d) Illustration of the structure of a coin serial number sn.

sfi

PRFaddr

a

Zerocash

We outline Zerocash, a concrete implementation, at 128 bits of security, of our DAP scheme construction; see Section 5 for details. Zerocash entails carefully instantiating the cryptographic ingredients of the construction to ensure that the zk-SNARK, the “heaviest” component, is efficient enough in practice. In the construction, the zk-SNARK is used to prove/verify a specific NP statement: POUR. While zk-SNARKs are asymptotically efficient, their concrete efficiency depends on the arithmetic circuit C that is used to decide the NP statement. Thus, we seek instantiations for which we can design a relatively small arithmetic circuit CPOUR for verifying the NP statement POUR. Our approach is to instantiate all of the necessary cryptographic ingredients (commitment schemes, pseudorandom functions, and collision-resistant hashing) based on SHA256. We first design a hand-optimized circuit for verifying SHA256 computations (or, more precisely, its compression function, which suffices for our purposes).6 Then, we use this circuit to construct CPOUR, which

verifies all the necessary checks for satisfying the NP statement CPOUR.

This, along with judicious parameter choices, and a state-of-the-art implementation of a zk-SNARK for arithmetic circuits [BCTV14] (see Section 2.4), results in a zk-SNARK prover

6Alternatively, we could have opted to rely on the circuit generators [PGHR13, BCG+13, BCTV14], which support various classes of C programs, by writing C code expressing the POUR checks. However, as discussed later, these generic approaches are more expensive than our hand-optimized construction.

running time of a few minutes and zk-SNARK verifier running time of a few milliseconds. This allows the DAP scheme implementation to be practical for deployment, as our experiments show.

Zerocash can be integrated into Bitcoin or forks of it (commonly referred to as “altcoins”); we later describe how this is done.

Paper organization

The remainder of this paper is organized as follows. Section 2 provides background on zk-SNARKs. We define DAP schemes in Section 3, and our construction thereof in Section 4. Section 5 discusses the concrete instantiation in Zerocash. Section 6 describes the integration of Zerocash into existing ledger-based currencies. Section 7 provides microbenchmarks for our prototype implementation, as well as results based on full-network simulations. Section 8 describes optimizations. We discuss concurrent work in Section 9 and summarize our contributions and future directions in Section 10.

Background on zk-SNARKs

The main cryptographic primitive used in this paper is a special kind of Succinct Non-interactive ARgument of Knowledge (SNARK). Concretely, we use a publicly-verifiable preprocessing zero- knowledge SNARK, or zk-SNARK for short. In this section we provide basic background on zk-SNARKs, provide an informal definition, compare zk-SNARKs with the more familiar notion of NIZKs, and recall known constructions and implementations.

Informal definition

We informally define zk-SNARKs for arithmetic circuit satisfiability. We refer the reader to, e.g., [BCI+13] for a formal definition.

For a field F, an F-arithmetic circuit takes inputs that are elements in F, and its gates output

elements in F. We naturally associate a circuit with the function it computes. To model nonde- terminism we consider circuits that have an input x ∈ Fn and an auxiliary input a ∈ Fh, called a witness. The circuits we consider only have bilinear gates. Arithmetic circuit satisfiability is defined analogously to the boolean case, as follows.

7

Deftnition 2.1. The arithmetic circuit satisfiability problem of an F-arithmetic circuit C : Fn × Fh → Fl is captured by the relation RC = {(x, a) ∈ Fn × Fh : C(x, a) = 0l}; its language is LC = {x ∈ Fn : ∃ a ∈ Fh s.t. C(x, a) = 0l}.

Given a field F, a (publicly-verifiable preprocessing) zk-SNARK for F-arithmetic circuit satisfiability is a triple of polynomial-time algorithms (KeyGen, Prove, Verify):

  • KeyGen(1λ, C) → (pk, vk). On input a security parameter λ (presented in unary) and an F- arithmetic circuit C, the key generator KeyGen probabilistically samples a proving key pk and a verification key vk. Both keys are published as public parameters and can be used, any number of times, to prove/verify membership in LC .
  • Prove(pk, x, a) → π. On input a proving key pk and any (x, a) ∈ RC , the prover Prove outputs a non-interactive proof π for the statement x ∈ LC .

7A gate with inputs y1, . . . , ym ∈ F is bilinear if the output is (˙a, (1, y1, . . . , ym)) · (˙b, (1, y1, . . . , ym)) for some

˙a,˙b ∈ Fm+1. These include addition, multiplication, negation, and constant gates.

  • Verify(vk, x, π) → b. On input a verification key vk, an input x, and a proof π, the verifier Verify

outputs b = 1 if he is convinced that x ∈ LC .

A zk-SNARK satisfies the following properties.

Completeness. For every security parameter λ, any F-arithmetic circuit C, and any (x, a) ∈ RC , the honest prover can convince the verifier. Namely, b = 1 with probability 1 − negl(λ) in the following experiment: (pk, vk) ← KeyGen(1λ, C); π ← Prove(pk, x, a); b ← Verify(vk, x, π).

Succinctness. An honestly-generated proof π has Oλ(1) bits and Verify(vk, x, π) runs in time Oλ(|x|). (Here, Oλ hides a fixed polynomial factor in λ.)

Proof of knowledge (and soundness). If the verifier accepts a proof output by a bounded prover, then the prover “knows” a witness for the given instance. (In particular, soundness holds against bounded provers.) Namely, for every poly(λ)-size adversary A, there is a poly(λ)-size extractor E such that Verify(vk, x, π) = 1 and (x, a) ƒ∈ RC with probability negl(λ) in the following experiment: (pk, vk) ← KeyGen(1λ, C); (x, π) ← A(pk, vk); a ← E (pk, vk).

Perfect zero knowledge. An honestly-generated proof is perfect zero knowledge.8 Namely, there is a polynomial-time simulator Sim such that for all stateful distinguishers D the following two probabilities are equal:

 . (pk, vk) ← KeyGen(1λ, C)  

. (pk, vk, trap) ← Sim(1λ, C) 

Pr (x, a) ∈ RC

D(π) = 1 .

(x, a) ← D(pk, vk)

π ← Prove(pk, x, a)

and

Pr (x, a) ∈ RC

D(π) = 1 .

(x, a) ← D(pk, vk) . π ← Sim(trap, x)

(the probability that D(π) = 1 on an honest proof)

(the probability that D(π) = 1 on a simulated proof)

Remark. Both proof of knowledge and zero knowledge are essential to the use of zk-SNARKs in this paper. Indeed, we consider circuits C that verify assertions about cryptographic primitives (such as using a knowledge of SHA256 pre-image as a binding commitment). Thus it does not suffice to merely know that, for a given input x, a witness for x ∈ LC exists. Instead, proof of knowledge ensures that a witness can be efficiently found (by extracting it from the prover) whenever the verifier accepts a proof. As for zero knowledge, it ensures that a proof leaks no information about the witness, beyond the fact that x ∈ LC .

Remark. In the security proofs (see Appendix D), we deal with provers producing a vector of inputs

˙x together with a vector of corresponding proofs ˙π. In such cases, it is convenient to use an extractor that can extract a vector of witnesses ˙a containing a valid witness for each valid proof. This “multi- instance” extraction follows from the “single-instance” one described above [BCCT12, BCCT13]. Namely, if (KeyGen, Prove, Verify) is a zk-SNARK, then for any poly(λ)-size prover adversary A there exists a poly(λ)-size extractor E such that

Pr  ∃ i s.t.

Verify(vk, xi, πi) = 1 (xi, ai) ∈/ RC .

(pk, vk) ← KeyGen(1λ, C)

(˙x, ˙π) ← A(pk, vk) ≤ negl(λ) .

˙a ← E (pk, vk)

Comparison with NIZKs

zk-SNARKs are related to a familiar cryptographic primitive: non-interactive zero-knowledge proofs of knowledge (NIZKs). Both zk-SNARKs and NIZKs require a one-time trusted setup of public

8While most zk-SNARK descriptions in the literature only mention statistical zero knowledge, all zk-SNARK constructions can be made perfect zero knowledge by allowing for a negligible error probability in completeness.

parameters (proving and verification keys for zk-SNARKs, and a common reference string for NIZKs). Both provide the same guarantees of completeness, proof of knowledge, and zero knowledge. The difference lies in efficiency guarantees. In a NIZK, the proof length and verification time depend on the NP language being proved. For instance, for the language of circuit satisfiability, the proof length and verification time in [GOS06b, GOS06a] are linear in the circuit size. Conversely, in a zk-SNARK, proof length depends only on the security parameter, and verification time depends only on the instance size (and security parameter) but not on the circuit or witness size.

Thus, zk-SNARKs can be thought of as “succinct NIZKs”, having short proofs and fast verifica- tion times. Succinctness comes with a caveat: known zk-SNARK constructions rely on stronger assumptions than NIZKs do (see below).

Known constructions and security

There are many zk-SNARK constructions in the literature [Gro10, Lip12, BCI+13, GGPR13, PGHR13, BCG+13, Lip13, BCTV14]. We are interested in zk-SNARKs for arithmetic circuit satisfiability, and the most efficient ones for this language are based on quadratic arithmetic programs [GGPR13, BCI+13, PGHR13, BCG+13, BCTV14]; such constructions provide a linear- time KeyGen, quasilinear-time Prove, and linear-time Verify.

Security of zk-SNARKs is based on knowledge-of-exponent assumptions and variants of Diffie– Hellman assumptions in bilinear groups [Gro10, BB04, Gen04]. While knowledge-of-exponent assumptions are fairly strong, there is evidence that such assumptions may be inherent for con- structing zk-SNARKs [GW11, BCCT12].

Remark (fully-succinct zk-SNARKs). The key generator KeyGen takes a circuit C as input. Thus, KeyGen’s running time is at least linear in the size of the circuit C. One could require KeyGen to not have to take C as input, and have its output keys work for all (polynomial-size) circuits C. In such a case, KeyGen’s running time would be independent of C. A zk-SNARK satisfying this stronger property is fully succinct. Theoretical constructions of fully-succinct zk-SNARKs are known, based on various cryptographic assumptions [Mic00, Val08, BCCT13]. Despite achieving essentially-optimal asymptotics [BFLS91, BGH+05, BCGT13b, BCGT13a, BCCT13] no implementations of them have been reported in the literature to date.

zk-SNARK implementations

There are three published implementations of zk-SNARKs: (i) Parno et al. [PGHR13] present an implementation of zk-SNARKs for programs having no data dependencies;9 (ii) Ben-Sasson et al. [BCG+13] present an implementation of zk-SNARKs for arbitrary programs (with data dependencies); and (iii) Ben-Sasson et al. [BCTV14] present an implementation of zk-SNARKs that supports programs that modify their own code (e.g., for runtime code generation); their implementation also reduces costs for programs of larger size and allows for universal key pairs.

Each of the works above also achieves zk-SNARKs for arithmetic circuit satisfiability as a stepping stone towards their respective higher-level efforts. In this paper we are only interested in a zk-SNARK for arithmetic circuit satisfiability, and we rely on the implementation of [BCTV14] for such a zk-SNARK.10 The implementation in [BCTV14] is itself based on the protocol of Parno et al. [PGHR13]. We thus refer the interested reader to [PGHR13] for details of the protocol, its

9They only support programs where array indices are restricted to be known compile-time constants; similarly, loop iteration counts (or at least upper bounds to these) must be known at compile time.

10In [BCTV14], one optimization to the verifier’s runtime requires preprocessing the verification key vk; for simplicity,

we do not use this optimization.

intuition, and its proof of security; and to [BCTV14] for the implementation and its performance. In terms of concrete parameters, the implementation of [BCTV14] provides 128 bits of security, and the field F is of a 256-bit prime order p.

Definition of a decentralized anonymous payment scheme

We introduce the notion of a decentralized anonymous payment scheme (DAP scheme), extending the notion of decentralized e-cash [MGGR13]. Later, in Section 4, we provide a construction.

Data structures

We begin by describing, and giving intuition about, the data structures used by a DAP scheme. The algorithms that use and produce these data structures are introduced in Section 3.2.

Basecoin ledger. Our protocol is applied on top of a ledger-based base currency such as Bitcoin; for generality we refer to this base currency as Basecoin. At any given time T , all users have access to LT , the ledger at time T , which is a sequence of transactions. The ledger is append-only (i.e., T < T j implies that LT is a prefix of LT j ).11 The transactions in the ledger include both Basecoin transactions as well as two new transaction types described below.

Public parameters. A list of public parameters pp is available to all users in the system. These are generated by a trusted party at the “start of time” and are used by the system’s algorithms.

Addresses. Each user generates at least one address key pair (addrpk, addrsk). The public key addrpk is published and enables others to direct payments to the user. The secret key addrsk is used to receive payments sent to addrpk. A user may generate any number of address key pairs.

Coins. A coin is a data object c, to which we associate the following.

  • A coin commitment, denoted cm(c): a string that appears on the ledger once c is minted.
  • A coin value, denoted v(c): the denomination of c, as measured in basecoins, as an integer between 0 and a maximum value vmax (which is a system parameter).
  • A coin serial number, denoted sn(c): a unique string associated with c, used to prevent double spending.
  • A coin address, denoted addrpk(c): an address public key, representing who owns c.

Any other quantities associated with a coin c (e.g., various trapdoors) are implementation details.

New transactions. Besides Basecoin transactions, there are two new types of transactions.

  • Mint transactions. A mint transaction txMint is a tuple (cm, v, ∗), where cm is a coin commitment, v is a coin value, and ∗ denotes other (implementation-dependent) information. The transaction txMint records that a coin c with coin commitment cm and value v has been minted.
  • Pour transactions. A pour transaction txPour is a tuple (rt, snold, snold, cmnew, cmnew, vpub, info, ∗),

1

2

1

2

where rt is a root of a Merkle tree, snold, snold are two coin serial numbers, cmnew, cmnew are

1 2 1 2

two coin commitments, vpub is a coin value, info is an arbitrary string, and ∗ denotes other

(implementation-dependent) information. The transaction txPour records the pouring of two input (and now consumed) coins cold, cold, with respective serial numbers snold, snold, into two

1 2 1 2

new output coins cnew, cnew, with respective coin commitments cmnew, cmnew, as well as a public

1 2 1 2

output vpub (which may be zero). Furthermore, txPour also records an information string info

(perhaps containing information on who is the recipient of vpub basecoins) and that, when this transaction was made, the root of the Merkle tree over coin commitments was rt (see below).

11In reality, the Basecoin ledger (such as the one of Bitcoin) is not perfect and may incur temporary inconsistencies. In this respect our construction is as good as the underlying ledger. We discuss the effects of this on anonymity and mitigations in Section 6.4.

Commitments of minted coins and serial numbers of spent coins. For any given time T ,

  • CMListT denotes the list of all coin commitments appearing in mint and pour transactions in LT ;
  • SNListT denotes the list of all serial numbers appearing in pour transactions in LT .

While both of these lists can be deduced from LT , it will be convenient to think about them as separate (as, in practice, these may be separately maintained for efficiency reasons; cf. Section 8.3).

Merkle tree over commitments. For any given time T , TreeT denotes a Merkle tree over CMListT and rtT its root. Moreover, the function PathT (cm) gives the authentication path from a coin commitment cm appearing in CMListT to the root of TreeT .12 For convenience, we assume that LT also stores rtT j for all T jT (i.e., it stores all past Merkle tree roots).

Algorithms

A DAP scheme Π is a tuple of polynomial-time algorithms

(Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) with the following syntax and semantics.

System setup. The algorithm Setup generates a list of public parameters:

Setup

      • inputs: security parameter λ
      • outputs: public parameters pp

The algorithm Setup is executed by a trusted party. The resulting public parameters pp are published and made available to all parties (e.g., by embedding them into the protocol’s implementation). The setup is done only once; afterwards, no trusted party is needed, and no global secrets or trapdoors are kept.

Creating payment addresses. The algorithm CreateAddress generates a new address key pair:

CreateAddress

      • inputs: public parameters pp
      • outputs: address key pair (addrpk, addrsk)

Each user generates at least one address key pair (addrpk, addrsk) in order to receive coins. The public key addrpk is published, while the secret key addrsk is used to redeem coins sent to addrpk. A user may generate any number of address key pairs; doing so does not require any interaction.

Minting coins. The algorithm Mint generates a coin (of a given value) and a mint transaction:

Mint

      • inputs:
        • public parameters pp
        • coin value v ∈ {0, 1, . . . , vmax}
        • destination address public key addrpk
      • outputs: coin c and mint transaction txMint

A system parameter, vmax, caps the value of any single coin. The output coin c has value v and coin address addrpk; the output mint transaction txMint equals (cm, v, ∗), where cm is the coin commitment of c.

12While we refer to Mekle trees for simplicity, it is straightforward to extend the definition to allow other data structures representing sets with fast insertion and efficient proofs of membership.

Pouring coins. The Pour algorithm transfers value from input coins into new output coins, marking the input coins as consumed. Moreover, a fraction of the input value may be publicly revealed. Pouring allows users to subdivide coins into smaller denominations, merge coins, and transfer ownership of anonymous coins, or make public payments.13

Pour

      • inputs:
        • public parameters pp
        • the Merkle root rt
        • old coins cold, cold

1 2

        • old addresses secret keys addrold , addrold

sk,1 sk,2

        • authentication path path1 from commitment cm(cold) to root rt, authentication path path2 from commitment cm(cold) to root rt

1

2

        • new values vnew, vnew

1 2

        • new addresses public keys addrnew , addrnew
        • public value vpub
        • transaction string info

pk,1

pk,2

      • outputs: new coins cnew, cnew and pour transaction txPour

1

2

Thus, the Pour algorithm takes as input two distinct input coins cold, cold, along with corresponding

1 2

address secret keys addrold , addrold

(required to redeem the two input coins). To ensure that

sk,1 sk,2

cold, cold have been previously minted, the Pour algorithm also takes as input the Merkle root rt

1 2

(allegedly, equal to the root of Merkle tree over all coin commitments so far), along with two

authentication paths path1, path2 for the two coin commitments cm(cold), cm(cold). Two input values

1 2

vnew, vnew specify the values of two new anonymous coins cnew, cnew to be generated, and two input

1 2 1 2

address public keys addrnew , addrnew specify the recipients of cnew, cnew. A third value, vpub, specifies

pk,1 pk,2 1 2

the amount to be publicly spent (e.g., to redeem coins or pay transaction fees). The sum of output

values vnew + vnew + vpub must be equal to the sum of the values of the input coins (and cannot

1 2

exceed vmax). Finally, the Pour algorithm also receives an arbitrary string info, which is bound into

the output pour transaction txPour.

The Pour algorithm outputs two new coins cnew, cnew and a pour transaction txPour. The

1 2

transaction txPour equals (rt, snold, snold, cmnew, cmnew, vpub, info, ∗), where cmnew, cmnew are the two

1

2

1

2

1

2

coin commitments of the two output coins, and ∗ denotes other (implementation-dependent) information. Crucially, txPour reveals only one value, the public value vpub (which may be zero); it does not reveal the payment addresses or values of the old or new coins.

Verifying transactions. The algorithm VerifyTransaction checks the validity of a transaction:

VerifyTransaction

      • inputs:
        • public parameters pp
        • a (mint or pour) transaction tx
        • the current ledger L
      • outputs: bit b, equals 1 iff the transaction is valid

Both mint and pour transactions must be verified before being considered well-formed. In practice, transactions can be verified by the nodes in the distributed system maintaining the ledger, as well

13We consider pours with 2 inputs and 2 outputs, for simplicity and (as discussed in Section 1.3) without loss of generality.

as by users who rely on these transactions.

Receiving coins. The algorithm Receive scans the ledger and retrieves unspent coins paid to a particular user address:

Receive

      • inputs:
        • recipient address key pair (addrpk, addrsk)
        • the current ledger L
      • outputs: set of (unspent) received coins

When a user with address key pair (addrpk, addrsk) wishes to receive payments sent to addrpk, he uses the Receive algorithm to scan the ledger. For each payment to addrpk appearing in the ledger, Receive outputs the corresponding coins whose serial numbers do not appear on the ledger L. Coins received in this way may be spent, just like minted coins, using the Pour algorithm. (We only require Receive to detect coins paid to addrpk via the Pour algorithm and not also detect coins minted by the user himself.)

Next, we describe completeness (Section 3.3) and security (Section 3.4).

Completeness

Completeness of a DAP scheme requires that unspent coins can be spent. More precisely, consider a ledger sampler S outputting a ledger L. If c1 and c2 are two coins whose coin commitments appear in (valid) transactions on L, but their serial numbers do not appear in L, then c1 and c2 can be spent using Pour. Namely, running Pour results in a pour transaction txPour that VerifyTransaction accepts, and the new coins can be received by the intended recipients (by using Receive); moreover, txPour correctly records the intended vpub and transaction string info. This property is formalized via an incompleteness experiment INCOMP.

Deftnition 3.1. A DAP scheme Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) is complete if no polynomial-size ledger sampler S wins INCOMP with more than negligible probability. (See Appendix B for details.)

Security

Security of a DAP scheme is characterized by three properties, which we call ledger indistinguishability, transaction non-malleability, and balance.

Deftnition 3.2. A DAP scheme Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) is secure if it satisfies ledger indistinguishability, transaction non-malleability, and balance.

Below, we provide an informal overview of each property, and defer formal definitions to Appendix C.

Each property is formalized as a game between an adversary A and a challenger C. In each game, the behavior of honest parties is realized via a DAP scheme oracle ODAP, which maintains a ledger L and provides an interface for executing CreateAddress, Mint, Pour and Receive algorithms for honest parties. To elicit behavior from honest parties, A passes a query to C, which (after sanity checks) proxies the query to ODAP. For each query that requests an honest party to perform an action, A specifies identities of previous transactions and the input values, and learns the resulting transaction, but not any of the secrets or trapdoors involved in producing that transaction. The oracle ODAP also provides an Insert query that allows A to directly add aribtrary transactions to the ledger L.

Ledger indistinguishability. This property captures the requirement that the ledger reveals no new information to the adversary beyond the publicly-revealed information (values of minted coins, public values, information strings, total number of transactions, etc.), even when the adversary can adaptively induce honest parties to perform DAP operations of his choice. That is, no bounded adversary A can distinguish between two ledgers L0 and L1, constructed by A using queries to two DAP scheme oracles, when the queries to the two oracles are publicly consistent : they have matching type and are identical in terms of publicly-revealed information and the information related to addresses controlled by A.

Ledger indistinguishability is formalized by an experiment L-IND that proceeds as follows.

First, a challenger samples a random bit b and initializes two DAP scheme oracles ODAP and

0

ODAP, maintaining ledgers L0 and L1. Throughout, the challenger allows A to issue queries to

1

ODAP and ODAP, thus controlling the behavior of honest parties on L0 and L1. The challenger

0

1

provides the adversary with the view of both ledgers, but in randomized order: LLeft := Lb and LRight := L1−b. The adversary’s goal is to distinguish whether the view he sees corresponds to (LLeft, LRight) = (L0, L1), i.e. b = 0, or to (LLeft, LRight) = (L1, L0), i.e. b = 1.

At each round of the experiment, the adversary issues queries in pairs Q, Qj of matching query type. If the query type is CreateAddress, then the same address is generated at both oracles. If it is to Mint, Pour or Receive, then Q is forwarded to L0 and Qj to L1; for Insert queries, query Q is forwarded to LLeft and Qj is forwarded to LRight. The adversary’s queries are restricted in the sense that they must maintain the public consistency of the two ledgers. For example, the public values for Pour queries must be the same, as well as minted amounts for Mint queries.

At the conclusion of the experiment, A outputs a guess bj, and wins when b = bj. Ledger indistinguishability requires that A wins L-IND with probability at most negligibly greater than 1/2.

Transaction non-malleability. This property requires that no bounded adversary A can alter any of the data stored within a (valid) pour transaction txPour. This transaction non-malleability prevents malicious attackers from modifying others’ transactions before they are added to the ledger (e.g., by re-targeting the Basecoin public output of a pour transaction).

Transaction non-malleability is formalized by an experiment TR-NM, in which A adaptively interacts with a DAP scheme oracle ODAP and then outputs a pour transaction tx. Letting T denote the set of pour transactions returned by ODAP, and L denote the final ledger, A wins the game if there exists tx ∈ T , such that (i) tx ƒ= tx; (ii) tx reveals a serial number contained in tx; and (iii) both tx and tx are valid with respect to the ledger L containing all transactions preceding tx on L. In other words, A wins the game if tx manages to modify some previous pour transaction to spend the same coin in a different way.

∗ j

Transaction non-malleability requires that A wins TR-NM with only negligible probability. (Note that A can of course produce valid pour transactions that are unrelated to those in T ; the condition that tx reveals a serial number of a previously-spent coin captures non-malleability.)

Balance. This property requires that no bounded adversary A can own more money than what he minted or received via payments from others.

Balance is formalized by an experiment BAL, in which A adaptively interacts with a DAP scheme oracle ODAP and then outputs a set of coins Scoin. Letting ADDR be set of addresses returned by CreateAddress queries (i.e., addresses of “honest” users), A wins the game if the total value he can spend or has spent (either as coins or Basecoin public outputs) is greater than the value he has minted or received. That is, A wins if vUnspent + vBasecoin + vA→ADDR > vMint + vADDR→A where:

  1. vUnspent is the total value of unspent coins in Scoin; (ii) vBasecoin is the total value of public outputs placed by A on the ledger; (iii) vMint is the total value of A’s mint transactions; (iv) vADDR→A is the total value of payments received by A from addresses in ADDR; (v) vA→ADDR is the total value

of payments sent by A to addresses in ADDR.

Balance requires that A wins BAL with only negligible probability.

Construction of a decentralized anonymous payment scheme

We show how to construct a DAP scheme (introduced in Section 3) using zk-SNARKs and other building blocks. Later, in Section 5, we give a concrete instantiation of this construction.

Cryptographic building blocks

We first introduce notation for the standard cryptographic building blocks that we use. We assume familiarity with the definitions of these building blocks; for more details, see, e.g., [KL07]. Throughout, λ denotes the security parameter.

Collision-resistant hashing. We use a collision-resistant hash function CRH : {0, 1}∗ → {0, 1}O(λ).

Pseudorandom functions. We use a pseudorandom function family PRF = {PRFx : {0, 1}∗ →

{0, 1}O(λ)}x where x denotes the seed. From PRFx, we derive three “non-overlapping” pseudorandom functions, chosen arbitrarily as PRFaddr(z) := PRFx(00″z) , PRFsn(z) := PRFx(01″z) , PRFpk(z) :=

x

x

x

PRFx(10″z). Furthermore, we assume that PRFsn is also collision resistant, in the sense that it is infeasible to find (x, z) ƒ= (xj, zj) such that PRFsn(z) = PRFsn(zj).

x

xj

Statistically-hiding commitments. We use a commitment scheme COMM where the bind- ing property holds computationally, while the hiding property holds statistically. It is denoted

{COMMx : {0, 1}∗ → {0, 1}O(λ)}x where x denotes the commitment trapdoor. Namely, to reveal a

commitment cm to a value z, it suffices to provide z and the trapdoor x; then one can check that

cm = COMMx(z).

One-time strongly-unforgeable digital signatures. We use a digital signature scheme Sig = (Gsig, Ksig, Ssig, Vsig) that works as follows.

  • G sig(1λ) → ppsig. Given a security parameter λ (presented in unary), Gsig samples public parameters

ppsig for the encryption scheme.

  • K sig(ppsig) → (pksig, sksig). Given public parameters ppsig, Ksig samples a public key and a secret key for a single user.
  • S sig(sksig, m) → σ. Given a secret key sksig and a message m, Ssig signs m to obtain a signature σ.
  • V sig(pksig, m, σ) → b. Given a public key pksig, message m, and signature σ, Vsig outputs b = 1 if the signature σ is valid for message m; else it outputs b = 0.

The signature scheme Sig satisfies the security property of one-time strong unforgeability against chosen-message attacks (SUF-1CMA security).

Key-private public-key encryption. We use a public-key encryption scheme Enc = (Genc, Kenc,

Eenc, Denc) that works as follows.

  • G enc(1λ) → ppenc. Given a security parameter λ (presented in unary), Genc samples public parameters ppenc for the encryption scheme.
  • K enc(ppenc) → (pkenc, skenc). Given public parameters ppenc, Kenc samples a public key and a secret key for a single user.
  • E enc(pkenc, m) → c. Given a public key pkenc and a message m, Eenc encrypts m to obtain a ciphertext c.
  • D enc(skenc, c) → m. Given a secret key skenc and a ciphertext c, Denc decrypts c to produce a message m (or ⊥ if decryption fails).

The encryption scheme Enc satisfies two security properties: (i) ciphertext indistinguishability under chosen-ciphertext attack (IND-CCA security); and (ii) key indistinguishability under chosen-ciphertext

attack (IK-CCA security). While the first property is standard, the second is less known; informally, IK-CCA requires that ciphertexts cannot be linked to the public key used to encrypt them, or to other ciphertexts encrypted with the same public key. For definitions, we refer the reader to [BBDP01].

zk-SNARKs for pouring coins

As outlined in Section 1.3, our construction invokes a zk-SNARK for a specific NP statement, POUR, which we now define. We first recall the context motivating POUR. When a user u pours “old” coins cold, cold into new coins cnew, cnew, a corresponding pour transaction

1 2 1 2

txPour = (rt, snold, snold, cmnew, cmnew, vpub, info, ∗)

1

2

1

2

is generated. In our construction, we need to provide evidence in “∗” that various conditions were respected by the pour operation. Concretely, txPour should demonstrate that (i) u owns cold, cold; (ii) coin commitments for cold, cold appear somewhere on the ledger; (iii) the revealed

1 2 1 2

serial numbers snold, snold are of cold, cold; (iv) the revealed coin commitments cmnew, cmnew

are

1 2 1 2 1 2

of cnew, cnew; (v) balance is preserved. Our construction achieves this by including a zk-SNARK

1 2

proof πPOUR for the statement POUR which checks the above invariants (as well as others needed for

non-malleability).

The statement POUR. Concretely, the NP statement POUR is defined as follows.

  • Instances are of the form x = (rt, snold, snold, cmnew, cmnew, vpub, hSig, h1, h2). Thus, an instance x

1

2

1

2

specifies a root rt for a CRH-based Merkle tree (over the list of commitments so far), the two serial numbers of the consumed coins, two coin commitments for the two new coins, a public value, and fields hSig, h1, h2 used for non-malleability.

  • Witnesses are of the form a = (path1, path2, cold, cold, addrold , addrold , cnew, cnew) where, for each

i ∈ {1, 2}:

1 2 sk,1

sk,2 1 2

cold = (addrold , vold, ρold, rold, sold, cmold) ,

i pk,i i i i i i

cnew = (addrnew, vnew, ρnew, rnew, snew, cmnew) for the same cmnew as in x,

i pk,i i i i i i i

addrold = (aold , pkold ) ,

pk,i

pk,i

enc,i

addrnew = (anew, pknew ) ,

pk,i

pk,i

enc,i

addrold = (aold , skold ) .

sk,i

sk,i

enc,i

Thus, a witness a specifies authentication paths for the two new coin commitments, the entirety of coin information about both the old and new coins, and address secret keys for the old coins.

Given a POUR instance x, a witness a is valid for x if the following holds: 1.For each i ∈ {1, 2}:

  1. The coin commitment cmold of cold appears on the ledger, i.e., pathi is a valid authentication

i i

path for leaf cmold with respect to root rt, in a CRH-based Merkle tree.

i

a

  1. The address secret key aold

matches the address public key of cold, i.e., aold

= PRFaddr(0).

sk,i

i pk,i

old sk,i

  1. The serial number snold of cold is computed correctly, i.e., snold = PRFsn (ρold).

i i i

a

old i

sk,i

  1. The coin cold is well-formed, i.e., cmold = COMM old (COMM old (aoldρold)”vold).

i

i

si

ri

pk,i

i

i

  1. The coin cnew is well-formed, i.e., cmnew = COMMsnew (COMMrnew (anewρnew)”vnew).

i

i

i

i

pk,i

i

i

  1. The address secret key aold

sk,i

ties hSig to hi, i.e., hi = PRFpk

sk,i

aold

(ihSig).

2. Balance is preserved: vnew + vnew + vpub = vold + vold (with vold, vold ≥ 0 and vold + voldvmax).

1

2

1

2

1

2

1

2

Recall that in this paper zk-SNARKs are relative to the language of arithmetic circuit satisfiability (see Section 2); thus, we express the checks in POUR via an arithmetic circuit, denoted CPOUR. In particular, the depth dtree of the Merkle tree needs to be hardcoded in CPOUR, and we thus make it a parameter of our construction (see below); the maximum number of supported coins is then 2dtree .

Algorithm constructions

We proceed to describe the construction of the DAP scheme Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) whose intuition was given in Section 1.3. Figure 2 gives the pseudocode for each one of the six algorithms in Π, in terms of the building blocks introduced in Section 4.1 and Section 4.2. In the construction, we hardcode two quantities: the maximum value of a coin, vmax, and the depth of the Merkle tree, dtree.

Completeness and security

Our main theorem states that the above construction is indeed a DAP scheme.

Theorem 4.1. The tuple Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive), as de- fined in Section 4.3, is a complete (cf. Definition 3.1) and secure (cf. Definition 3.2) DAP scheme.

We provide a proof of Theorem 4.1 in Appendix D. We note that our construction can be modified to yield statistical (i.e., everlasting) anonymity; see the discussion in Section 8.1.

Remark (trusted setup). Security of Π relies on a trusted party running Setup to generate the public parameters (once and for all). This trust is needed for the transaction non-malleability and balance properties but not for ledger indistinguishability. Thus, even if a powerful espionage agency were to corrupt the setup, anonymity will still be maintained. Moreover, if one wishes to mitigate the trust requirements of this step, one can conduct the computation of Setup using secure multiparty computation techniques; we leave this to future work.

Remark (use of pp). According to the definition of a DAP scheme (see Section 3), the public parameters pp are given as input to each one of the six algorithms; this is also how we presented our construction in Figure 2. However, in our construction, the public parameters pp equal a tuple (pkPOUR, vkPOUR, ppenc, ppsig), and not every algorithm needs every component of pp. Concretely, CreateAddress only needs ppenc; Mint only the security parameter λ; Pour only pkPOUR and ppsig; VerifyTransaction only vkPOUR; and Receive only λ. In particular, since we rely on zk-SNARKs to prove/verify POUR, pkPOUR is of constant, but large, size, and is only required by Pour. All other components of pp are of small constant size.

Remark (checking received coins in ledger). The algorithm Receive tests whether the serial number of a received coin already appears on the ledger, in order not to output coins that the user has already received and spent by himself. Other users are, in any case, unable to spend coins addressed to this user.

Zerocash

We describe a concrete instantiation of a DAP scheme; this instantiation forms the basis of Zerocash. Later, in Section 6, we discuss how Zerocash can be integrated with existing ledger-based currencies.

Setup

  • inputs: security parameter λ
  • outputs: public parameters pp
  1. Construct CPOUR for POUR at security λ.

λ

Pour

  • inputs:
    • public parameters pp
    • the Merkle root rt
    • old coins cold, cold
  1. Compute ( pkPOUR, vkPOUR) := KeyGen(1 , CPOUR).

1 2

old

old

  1. Compute pp

:= Genc(1λ).

  • old addresses secret keys addrsk,1, addrsk,2

enc

λ

path path from commitment cm(cold) to root rt,

  1. Compute ppsig := Gsig(1 ).

1

path path

1

from commitment cm(cold) to root rt

  1. Set pp := (pkPOUR, vkPOUR, ppenc, ppsig).

2

new

2

new

  1. Output pp.
  • new values v1 , v2
  • new addresses public keys addrnew , addrnew

CreateAddress

  • public value vpub
  • transaction string info

pk,1

pk,2

  • inputs: public parameters pp
  • outputs: address key pair (addrpk, addrsk)

1.Compute ( pkenc, skenc) := Kenc(ppenc).

  • outputs: new coins c1 , c2 and pour transaction txPour
  1. For each i 1, 2 :

new new

∈ { }

    1. Parse cold as (addrold , vold, ρold, rold, sold, cmold).
  1. Randomly sample a PRFaddr seed a .

i pk,i i

i i i i

sk (b)Parse addrold as (aold , skold ).

  1. Compute apk = PRFaddr(0).

sk,i

sk,i

enc,i

ask

    1. Compute snold := PRFsn

(ρold).

  1. Set addrpk

:= (apk

, pkenc).

i

new

enc,i

new

old i

sk,i

a

  1. Set addrsk := (ask

, skenc).

    1. Parse addrpk,i as (apk,i, pknew ).
  1. Output ( addrpk

, addrsk).

    1. Randomly sample a PRFsn seed ρnew.
    2. Randomly sample two COMM trapdoors rnew, snew. (g)Compute knew := COMMrnew (anewρnew).

i

i i

Mint

i

new

i pk,i i

      1. Compute cmi := COMMsnew (vnewknew).
  • inputs:

new

new

new i i i

new

      1. Set ci := (addrpk

, vi , ρnew, rnew, snew, cmi ).

    • public parameters pp

,i i i i

      1. Set Ci := Eenc(pknew , (vnew, ρnew, rnew, snew)).
    • coin value v ∈ {0, 1, . . . , vmax}

enc,i i

i i i

    • destination address public key addrpk
  • outputs: coin c and mint transaction txMint
  1. Parse addrpk as (apk, pkenc).
  2. Generate ( pksig , sksig) := Ksig(ppsig ).
  3. Compute hSig := CRH(pksig ).
  4. Compute h1 := PRFpk (1 hSig) and h2 := PRFpk

aold

aold

sk,1 sk,2

(2″hSig).

sn 5.Set x := (rt, snold, snold, cmnew, cmnew, vpub, hSig, h1, h2).

  1. Randomly sample a PRF

seed ρ.

1 2 1 2

6.Set a := (path , path , cold, cold, addrold , addrold , cnew, cnew).

  1. Randomly sample two COMM trapdoors r, s.

1 2 1 2

sk,1

sk,2 1 2

  1. Compute k := COMMr(apkρ). 5.Compute cm := COMMs(vk). 6.Set c := (addrpk, v, ρ, r, s, cm).
  2. Set txMint := (cm, v, ∗), where ∗ := (k, s).
    1. Compute πPOUR := Prove(pkPOUR, x, a).
    2. Set m := (x, πPOUR, info, C1, C2). 9.Compute σ := Ssig(sksig, m).

10.Set txPour := (rt, snold, snold, cmnew, cmnew, vpub, info, ∗), where

1 2 1 2

  1. Output c and txMint.

VerifyTransaction

  • inputs:
    • public parameters pp
    • a (mint or pour) transaction tx
    • the current ledger L
  • outputs: bit b, equals 1 iff the transaction is valid 1.If given a mint transaction tx = txMint:
    1. Parse txMint as (cm, v, ∗), and ∗ as (k, s).
    2. Set cm := COMMs(v k).

j

    1. Output b := 1 if cm = cmj, else output b := 0.

2.If given a pour transaction tx = txPour:

:= (pksig, h1, h2, πPOUR, C1, C2, σ).

11.Output cnew, cnew and txPour.

1 2

Receive

  • inputs:
    • public parameters pp
    • recipient address key pair (addrpk, addrsk)
    • the current ledger L
  • outputs: set of received coins

1.Parse addrpk as (apk, pkenc). 2.Parse addrsk as (ask, skenc).

  1. For each Pour transaction txPour on the ledger:
    1. Parse txPour as (rt, snold, snold, cmnew, cmnew, vpub, info, ∗),

1

2

1

2

  1. Parse txPour as (rt, snold, snold, cmnew, cmnew, vpub, info, ∗), and

1

2

1

2

and ∗ as (pksig, h1, h2, πPOUR, C1, C2, σ).

∗ as (pksig, h1, h2, πPOUR, C1, C2, σ).

  1. If snold or snold appears on L (or snold = snold), output b := 0. (c)If the Merkle root rt does not appear on L, output b := 0. (d)Compute hSig := CRH(pk ).

1

2

1

2

sig

  1. For each i ∈ {1, 2}:
    1. Compute ( vi, ρi, ri, si) := Denc(skenc, Ci). ii.If Denc’s output is not ⊥, verify that:
      • cm equals COMMs (vi“COMMr (apkρi));

new

i

i

i

ask

sn

  1. Set x := (rt, snold, snold, cmnew, cmnew, v , h , h , h ).

1

2

1

2

pub

Sig

1

2

  • sni := PRF (ρi) does not appear on L.
  1. Set m := (x, πPOUR, info, C1, C2). (g)Compute b := Vsig(pksig , m, σ).

(h)Compute bj := Verify(vkPOUR, x, πPOUR), and output b bj.

  1. If both checks succeed, output

ci := (addrpk, vi, ρi, ri, si, cmnew).

i

Figure 2: Construction of a DAP scheme using zk-SNARKs and other ingredients.

Instantiation of building blocks

We instantiate the DAP scheme construction from Section 4 (see Figure 2), aiming at a level of security of 128 bits. Doing so requires concrete choices, described next.

CRH, PRF, COMM from SHA256. Let H be the SHA256 compression function, which maps a 512-bit input to a 256-bit output. We mostly rely on H, rather than the “full” hash, since this suffices for our fixed-size single-block inputs, and it simplifies the construction of CPOUR (see Section 5.2). We instantiate CRH, PRF, COMM via H (under suitable assumptions on H).

First, we instantiate the collision-resistant function CRH as H(z) for z ∈ {0, 1}512; this function

compresses “two-to-one”, so it can be used to construct binary Merkle trees.

14

Next, we instantiate the pseudorandom function PRFx(z) as H(xz), with x ∈ {0, 1}256 as the seed, and z ∈ {0, 1}256 as the input.15 Thus, the derived functions are:

PRFaddr(z) := H(x“00”z) , PRFsn(z) := H(x“01”z) , PRFpk(z) := H(x“10”z) ,

x x x

with x ∈ {0, 1}256 and z ∈ {0, 1}254.

As for the commitment scheme COMM, we only use it in the following pattern:

k := COMMr(apkρ) ,

cm := COMMs(vk) .

Due to our instantiation of PRF, apk is 256 bits. So we can set ρ also to 256 bits and r to 256 + 128 = 384 bits; then we can compute

k := COMMr(apkρ) as H(r“[H(apkρ)]128) .

Above, [·]128 denotes that we are truncating the 256-bit string to 128 bits (say, by dropping least- significant bits, as in our implementation). Heuristically, for any string z ∈ {0, 1}128, the distribution induced by H(rz) is 2−128-close to uniform, and this forms the basis of the statistically-hiding property. For computing cm, we set coin values to be 64-bit integers (so that, in particular, vmax = 264 − 1 in our implementation), and then compute

cm := COMMs(vk) as H(k“0192v) .

Noticeably, above we are ignoring the commitment randomness s. The reason is that we already know that k, being the output of a statistically-hiding commitment, can serve as randomness for the next commitment scheme.

Instantiating the NP statement POUR. The above choices imply a concrete instantiation of the NP statement POUR (see Section 4.2). Specifically, in our implementation, POUR checks that the following holds, for each i ∈ {1, 2}:

    • pathi is an authentication path for leaf cmold with respect to root rt, in a CRH-based Merkle tree;

i

sk,i

old pk,i

  • a

= H(aold “0256);

    • snold = H(aold “01”[ρold]254);

i

sk,i

i

    • cmold = H(H(rold“[H(aoldρold)]128)”0192vold);

i

i

pk,i

i

i

14A single exception: we still compute hSig according to the full hash SHA256, rather than its compression function,

because there is no need for this computation to be verified by CPOUR.

15This assumption is reminiscent of previous works analyzing the security of hash-based constructions (e.g., [Bel06]). However in this work we assume that a portion of the compression function is the seed for the pseudorandom function, rather than using the chaining variable as in [Bel06].

    • cmnew = H(H(rnew“[H(anewρnew)]128)”0192vnew); and

i

i

pk,i

i

i

    • hi = H(aold “10”bi“[hSig]253) where b1 := 0 and b2 := 1.

sk,i

Moreover, POUR checks that vnew + vnew + vpub = vold + vold, with vold, vold ≥ 0 and vold + vold < 264.

1

2

1

2

1

2

1

2

Finally, as mentioned, in order for CPOUR to be well-defined, we need to fix a Merkle-tree depth

dtree. In our implementation, we fix dtree = 64, and thus support up to 264 coins.

Instantiating Sig. For the signature scheme Sig, we use ECDSA to retain consistency and compatibility with the existing bitcoind source code. However, standard ECDSA is malleable: both (r, s) and (r, s) verify as valid signatures. We use a non-malleable variant, where s is restricted to the “lower half” of field elements. While we are not aware of a formal SUF-1CMA proof for this variant, its use is consistent with proposals to resolve Bitcoin transaction malleability [Wui14].16

Instantiating Enc. For the encryption scheme Enc, we use the key-private Elliptic-Curve Integrated Encryption Scheme (ECIES) [Cer00]; it is one of the few standardized key-private encryption schemes with available implementations.

Arithmetic circuit for pouring coins

Our DAP scheme construction from Section 4 (see Figure 2) also requires zk-SNARKs relative to the NP statement POUR. These are obtained by invoking a zk-SNARK for arithmetic circuit satisfiability (see Section 2.4) on an arithmetic circuit CPOUR, which verifies the NP statement POUR. In our instantiation, we rely on the implementation of [BCTV14] for the basic zk-SNARK (see Section 2.4), and apply it to the circuit CPOUR whose construction is described next.

An arithmetic circuit for verifying SHA256’s compression function

The vast majority of the “verification work” in POUR is verifying computations of H, the compression function of SHA256 (see Section 5.1). Thus, we begin by discussing our construction of an arithmetic circuit CH for verifying SHA256 computations. Later, in Section 5.2.2, we discuss the construction of CPOUR, given the circuit CH.

We wish to construct an arithmetic circuit CH such that, for every 256-bit digest h and 512-bit

input z, (h, z) ∈ RCH if and only if h = H(z). Naturally, our goal is to minimize the size of CH. Our high-level strategy is to construct CH, piece by piece, by closely following the SHA256 official specification [Nat12]. For each subcomputation of SHA256, we use nondeterminism and field operations to verify the subcomputation using as few gates as possible.

Overview of SHA256’s compression function. The primitive unit in SHA256 is a 32-bit word. All subcomputations are simple word operations: three bitwise operations (and, or, xor), shift-right, rotate-right, and addition modulo 232. The compression function internally has a state of 8 words, initialized to a fixed value, and then transformed in 64 successive rounds by following the 64-word message schedule (deduced from the input z). The 256-bit output is the concatenation of the 8 words of the final state.

Representing a state. We find that, for each word operation (except for addition modulo 232), it is more efficient to verify the operation when its inputs are represented as separate wires, each carrying a bit. Thus, CH maintains the 8-word state as 256 individual wires, and the 64-word message schedule as 64 · 32 wires.

Addition modulo 32. To verify addition modulo 232 we use techniques employed in previous

work [PGHR13, BCG+13, BCTV14]. Given two words A and B, we compute α := Σ31 2i(Ai + Bi).

i=0

16In practice, one might replace this ECDSA variant with an EC-Schnorr signature satisfying SUF-1CMA security with proper encoding of EC group elements; the performance would be similar.

Because F has characteristic larger than 233, there is no wrap around; thus, field addition coincides with integer addition. We then make a non-deterministic guess for the 33 bits αi of α (including

carry), and enforce consistency by requiring that α = Σ32 2iαi. To ensure that each αi ∈ {0, 1},

i=0

we use a 33-gate subcircuit computing αi(αi 1), all of which must be 0 for the subcircuit to be

satisfiable. Overall, verifying addition modulo 232 only requires 34 gates. This approach extends in a straightforward way to summation of more than two terms.

Verifying the SHA256 message schedule. The first 16 words Wi of the message schedule are the 16 words of the 512-bit input z. The remaining 48 words are computed as Wt := σ1(Wt−2) + Wt−7 + σ0(Wt−15) + Wt−16, where σ0(W ) := rotr7(W ) ⊕ rotr18(W ) ⊕ shr3(W ) and σ1 has the same structure but different rotation and shift constants.

The rotation and shift amounts are constants, so rotates and shifts can be achieved by suitable wiring to previously computed bits (or the constant 0 for high-order bits in shr). Thus, since the XOR of 3 bits can be computed using 2 gates, both σ0 and σ1 can be computed in 64 gates. We then compute (or more precisely, guess and verify) the addition modulo 232 of the four terms.

Verifying the SHA256 round function. The round function modifies the 8-word state by changing two of its words and then permuting the 8-word result.

Each of the two modified words is a sum modulo 232 of (i) round-specific constant words Kt;

  1. message schedule words Wt; and (iii) words obtained by applying simple functions to state words. Two of those functions are bitwise majority (Maj(A, B, C)i = 0 if Ai + Bi + Ci ≤ 1 else 1) and bitwise choice (Ch(A, B, C)i = Bi if Ai = 1, else Ci). We verify correct computation of Maj using 2 gates per output bit, and Ch with 1.

Then, instead of copying 6 unchanged state words to obtain the permuted result, we make the permutation implicit in the circuit’s wiring, by using output wires of previous sub-computations (sometimes reaching 4 round functions back) as input wires to the current sub-computation.

Performance. Overall, we obtain an arithmetic circuit CH for verifying SHA256’s compression function with less than 30 000 arithmetic gates. See Figure 3 for a breakdown of gate counts.

Gate count for CH
Message schedule 8032
All rounds 19 584
1 round (of 64) 306
Finalize 288
Total 27 904

Figure 3: Size of circuit CH for SHA256’s compression function.

Comparison with generic approaches. We constructed the circuit CH from scratch. We could have instead opted for more generic approaches: implement SHA256’s compression function in a higher-level language, and use a circuit generator to obtain a corresponding circuit. However, generic approaches are significantly more expensive for our application, as we now explain.

Starting from the SHA256 implementation in PolarSSL (a popular cryptographic library) [Pol13], it is fairly straightforward to write a C program for computing H. We wrote such a program, and gave it as input to the circuit generator of [PGHR13]. The output circuit had 58160 gates, more than twice larger than our hand-optimized circuit.

Alternatively, we also compiled the same C program to TinyRAM, which is the architecture supported in [BCG+13]; we obtained a 5371-instruction assembly code that takes 5704 cycles to execute on TinyRAM. We could then invoke the circuit generator in [BCG+13] when given this TinyRAM program and time bound. However, each TinyRAM cycle costs ≈ 1000 gates, so the resulting circuit would have at least 5.7 · 106 gates, i.e., over 190 times larger than our circuit. A

similar computation holds for the circuit generator in [BCTV14], which supports an even more flexible architecture.

Thus, overall, we are indeed much better off constructing CH from scratch. Of course, this is not surprising, because a SHA256 computation is almost a “circuit computation”: it does not make use of complex program flow, accesses to memory, and so on. Thus, relying on machinery developed to support much richer classes of programs does not pay off.

Arithmetic circuit for POUR

The NP statement POUR requires verifying membership in a Merkle tree based on H, a few additional invocations of H, and integer addition and comparison. We construct the circuit CPOUR for POUR by combining various subcircuits verifying each of these. There remains to to discuss the subcircuits for verifying membership in a Merkle tree (using the aforementioned subcircuit CH for verifying invocations of H), and integer addition and comparison.

Merkle tree membership. We need to construct an arithmetic circuit that, given a root rt, authentication path path, and coin commitment cm, is satisfied if and only if path is a valid authentication path for the leaf cm with respect to the root rt. The authentication path path includes, for each layer i, an auxiliary hash value hi and a bit ri specifying whether hi was the left (ri = 0) or the right (ri = 1) child of the parent node. We then check membership in the Merkle tree by verifying invocations of H, bottom-up. Namely, for d = 64, we set kd−1 = cm; then, for each i = d − 1, . . . , 1, we set Bi = hiki if ri = 0 else kihi, and compute ki−1 = H(Bi). Finally we check that the root k0 matches the given root rt.

Integer addition. We need to construct an arithmetic circuit that, given 64-bit integers A, B, C

(presented as binary strings), is satisfied if and only if C = A + B over the integers. Again relying

on the fact that F’s characteristic is sufficiently large, we do so by checking that Σ63 2ici =

Σ63 i

i=0

i=0 2 (bi + ai) over F; this is enough, because there is no wrap around.

Integer comparison. We need to construct an arithmetic circuit that, given two 64-bit integers

A, B (represented in binary), is satisfied if and only if A + B fits in 64 bits (i.e. A + B < 264). We

do so by checking that Σ63 2i(bi + ai) = Σ63 ci for some ci ∈ {0, 1}. Indeed, if A + B < 264 then

i=0

i=0

it suffices to take ci as the binary representation of A + B. However, if A + B ≥ 264 then no choice

of ci can satisfy the constraint as Σ63 ci ≤ 264 − 1. Overall, this requires 65 gates (1 gate for the

i=0

equality check, and 64 gates for ensuring that c0, . . . , c63 are boolean).

Overall circuit sizes. See Figure 4 for the size of CPOUR. More than 99% of the gates are devoted to verifying invocations of H.

Gate count for CPOUR
Ensure cmold is in Merkle tree

1

(1 layer out of 64)

Ensure cmold is in Merkle tree

2

(1 layer out of 64)

Check computation of snold, snold

1 2

Check computation of aold , aold

pk,1 pk,2

Check computation of cmold, cmold, cmnew, cmnew

1 2 1 2

Check computation of h1, h2

Ensure that vnew + vnew + vpub = vold + vold

1 2 1 2

Ensure that vold + vold < 264

1 2

Miscellaneous

1 802 304

(28 161)

1 802 304

(28 161)

2 × 27 904

2 × 27 904

4 × 83 712

2 × 27 904

1

65

2384

Total 4 109 330

Figure 4: Size of the circuit CPOUR, which verifies the statement POUR.

Integration with existing ledger-based currencies

Zerocash can be deployed atop any ledger (even one maintained by a central bank). Here, we briefly detail integration with the Bitcoin protocol. Unless explicitly stated otherwise, in the following section when referring to Bitcoin, and its unit of account bitcoin (plural bitcoins), we mean the underlying protocol and software, not the currency system. (The discussion holds, with little or no modification, for many forks of Bitcoin, also known as “altcoins”, such as Litecoin.)

By introducing new transaction types and payment semantics, Zerocash breaks compatibility with the Bitcoin network. While Zerocash could be integrated into Bitcoin (the actual currency and its supporting software) via a “flag day” where a super-majority of Bitcoin miners simultaneously adopt the new software, we neither expect nor advise such integration in the near future and suggest using Zerocash in a separate altcoin.

Integrating Zerocash into Bitcoin consists of adding a new transaction type, Zerocash transactions, and modifying the protocol and software to invoke Zerocash’s DAP interface to create and verify these transactions. There are at least two possible approaches to this integration. The first approach replaces all bitcoins with zerocoins, making all transactions anonymous at the cost of losing any additional Bitcoin functionality provided by, e.g., the Bitcoin scripting language (see Section 6.1). The second approach maintains this functionality, adding a parallel Zerocash currency, zerocoin, which can be converted to and from bitcoin at a one-to-one rate (see Section 6.2). Options for protocol-level modifications for the later approach are discussed in Section 6.3; the former can be readily inferred. In Section 6.4 we discuss anonymizing the network layer of Bitcoin and anonymity safeguards.

Integration by replacing the base currency

One approach is to alter the underlying system so that all monetary transactions are done using Zerocash, i.e., by invoking the DAP interface and writing/reading the associated transactions in the distributed ledger.

As seen in Section 3, this suffices to offer the core functionality of payments, minting, merging, splitting, etc., while assuring users that all transactions using this currency are anonymous. However, this has several drawbacks: (1) All pour transactions incur the cost of generating a zk-SNARK proof.

(2) If Bitcoin supports additional features, such as a scripting language for specifying conditions for claiming bitcoins (as in Bitcoin), then these features are lost.17 (3) Bitcoin allows the flexibility of spending unconfirmed transactions; instead, with a Zerocash-only Bitcoin, this flexibility is lost: transactions must be confirmed before they can be spent. (And this imposes a minimal delay between receiving funds and spending them.)

Integration by hybrid currency

A different approach is to extend Bitcoin with a parallel, anonymized currency of “zerocoins”, existing alongside bitcoins, using the same ledger, and with the ability to convert freely between the two. The behavior and functionality of regular bitcoins is unaltered; in particular, they may support functionality such as scripting.

In this approach, the Bitcoin ledger consists of Bitcoin-style transactions, containing inputs and outputs [Nak09]. Each input is either a pointer to an output of a previous transaction (as in plain Bitcoin), or a Zerocash pour transaction (which contributes its public value, vpub, of bitcoins to this transaction). Outputs are either an amount and destination public address/script (as in plain

17However, in principle POUR could be extended to include a scripting language interpreter.

Bitcoin), or a Zerocash mint transaction (which consumes the input bitcoins to produce zerocoins). The usual invariant over bitcoins is maintained and checked in plain view: the sum of bitcoin inputs (including pours’ vpub) must be at least the sum of bitcoin outputs (including mints’ v), and any difference is offered as a transaction fee. However, the accounting for zerocoins consumed and produced is done separately and implicitly by the DAP scheme.

The life cycle of a zerocoin is as follows.

Creating new zerocoins. A mint transaction consumes v worth of bitcoins as inputs, and outputs coin commitment worth v zerocoins. The v bitcoins are effectively destroyed, in exchange for the newly-minted zerocoins.

Spending zerocoins. Zerocoins can then be transferred, split, and merged into other zerocoins arbitrarily, via pour transactions which, instead of explicit inputs, include zero-knowledge proofs that such inputs exist. Pour transactions may optionally reveal a non-zero public output vpub. This is either left unclaimed as a transaction fee,18 placed into a standard Bitcoin transaction output (e.g., one paying to a public key) or consumed by a mint transaction. Thus, vpub bitcoins are created ex nihilo (similarly to how coinbase transactions produce bitcoin outputs as mining reward), in exchange for destroying that amount of zerocoins. The Bitcoin outputs must be included in the transaction string info, which is included as part of a pour transaction; transaction non-malleability ensures that all this information is bound together.

Spending multiple zerocoins. To allow for pours to span more than two input and output coins, txPour structures may be chained together within one transaction by marking some output coin commitments as intermediates and having subsequent pours in the same transaction constructed relative to an ephemeral Merkle tree consisting of only the intermediates commitments. For example, a transaction might accept four input coins, with the first two Pour operations combining two of the inputs to produce an intermediate commitment each and a final Pour combining the two intermediate commitments into a final output new coin. Since the intermediate results are consumed instantly within the transaction, they need not be recorded in the global Merkle tree or have their serial numbers marked as spent.

Transaction fees. Collecting transaction fees is done as usual, via a coinbase transaction added to each block, which pays as mining reward the difference between the total inputs (bitcoin and pours’ vpub) and total outputs (bitcoin and mints’ v) in this block. Payment is either in bitcoins or in newly-minted zerocoins (via a Mint).

Validation and block generation. All transactions are verified via VerifyTransaction when they are received by a node. Any plain Bitcoin inputs and outputs are processed as usual, and any Zerocash inputs and outputs are checked using VerifyTransaction with the entire Bitcoin transaction fed in as info for authentication. Once these transactions are assembled into a candidate block, each transaction needs to be verified again to ensure its serial number has not become spent or its Merkle root invalid. If these checks pass, the set of new coin commitments and spent serial numbers output by the included transactions are added to the global sets, and the new Merkle root and a digest of the serial number list is stored in the new block.19 Embedding this data simplifies statekeeping and allows nodes to readily verify they have the correct coin list and serial number list. Upon receiving a candidate block, nodes validate the block is formed correctly with respect to the above procedure.

Receiving payments. In order to receive payments to an address, users may scan the block chain by running the Receive on every pour transaction. Alternatively they may receive coin information

18Since transaction fees may potentially be claimed by any node in the network, they represent the sole zerocoin output that cannot be hidden from public view even in a Zerocash-only system.

19This can be stored in the coinbase transaction, as certain other data currently is, or in a new field in the block

header.

via some out-of-band mechanism (e.g., via encrypted email). The former process is nearly identical to the one proposed for “stealth addresses” for Bitcoin. In the worst case, scanning the block chain requires a trial decryption of every ciphertext C. We expect many scenarios to provide explicit notification, e.g., in interactive purchases where a communication channel already exists from the payer to the payee. (Implementations may opt to drop the receive mechanism entirely, and require out-of-band notification, in order to avoid storing the ciphertexts in the block chain.)

Extending the Bitcoin protocol to support the combined semantics

While the section above describes the life-cycle of a zerocoin and semantics of the system, there remains the question of how transactions acquire the above necessary semantics. Two implementation approaches are possible, with different engineering tradeoffs.

The first approach is to extend the protocol and its implementation with hard-coded validation of Zerocash transactions, reading them from new, designated fields in transactions and running VerifyTransaction. In this case the zk-SNARK itself effectively replaces the scripting language for Zerocash transactions.

The second approach is to extend Bitcoin’s scripting language by adding an opcode that invokes VerifyTransaction, with the requisite arguments embeded alongside the opcode script. Such transactions must be exempt from the requirement they reference an input (as they are Zerocash transactions are self-contained), and, like coinbase transactions, be able to create bitcoins ex nihilo (to account for vpub). Moreover, while VerifyTransaction is run at the standard point in the Bitcoin transaction processing flow for evaluating scripts, the coin commitments and spent serial numbers are not actually added to CMList (resp., SNList) until their containing block is accepted (i.e., merely verifying a transaction does not have side effects).

Additional anonymity considerations

Zerocash only anonymizes the transaction ledger. Network traffic used to announce transactions, retrieve blocks, and contact merchants still leaks identifying information (e.g., IP addresses). Thus users need some anonymity network to safely use Zerocash. The most obvious way to do this is via Tor [DMS04]. Given that Zerocash transactions are not low latency themselves, Mixnets (e.g., Mixminion [DDM03]) are also a viable way to add anonymity (and one that, unlike Tor, is not as vulnerable to traffic analysis). Using mixnets that provide email-like functionality has the added benefit of providing an out-of-band notification mechanism that can replace Receive.

Additionally, although in theory all users have a single view of the block chain, a powerful attacker could potentially fabricate an additional block solely for a targeted user. Spending any coins with respect to the updated Merkle tree in this “poison-pill” block will uniquely identify the targeted user. To mitigate such attacks, users should check with trusted peers their view of the block chain and, for sensitive transactions, only spend coins relative to blocks further back in the ledger (since creating the illusion for multiple blocks is far harder).

Experiments

To measure the performance of Zerocash, we ran several experiments. First, we benchmarked the performance of the zk-SNARK for the NP statement POUR (Section 7.1) and of the six DAP scheme algorithms (Section 7.2). Second, we studied the impact of a higher block verification time via a simulation of a Bitcoin network (Section 7.3).

Performance of zk-SNARKs for pouring coins

Our zk-SNARK for the NP statement POUR is obtained by constructing an arithmetic circuit CPOUR for verifying POUR, and then invoking the generic implementation of zk-SNARK for arithmetic circuit satisfiability of [BCTV14] (see Section 2.4). The arithmetic circuit CPOUR is built from scratch and hand-optimized to exploit nondeterministic verification and the large field characteristic (see Section 5.2) .

Figure 5 reports performance characteristics of the resulting zk-SNARK for POUR. This includes three settings: single-thread performance on a laptop machine; and single-thread and multi-thread performance on a desktop machine. (The time measurements are the average of 10 runs, with standard deviation under 2.5%.) For instance, with single-thread code on the laptop machine, we obtain that:

    • Key generation takes 7 min 48 s, and results in a proving key pkPOUR of 896 MiB and a verification key vkPOUR of 749 B. This is performed only once, as part of the Setup algorithm.
    • Producing a proof πPOUR requires about 3 minutes; proofs have a constant size of 288 B. Proof generation is a subroutine of the Pour algorithm, and the resulting proof is included in the corresponding pour transaction.
    • A proof πPOUR can be verified in only 8.5 ms. Proof verification is a subroutine of the VerifyTransaction algorithm, when it is given as input a pour transaction to verify.
Intel Intel
Core i7-2620M Core i7-4770
@ 2.70GHz @ 3.40GHz
12GB of RAM 16GB of RAM
1 thread 1 thread 4 threads
KeyGen Time 7 min 48 s 5 min 11 s 1 min 47 s
Proving key 896 MiB
Verification key 749 B
Prove Time 2 min 55 s 1 min 59 s 46 s
Proof 288 B
Verify Time 8.5 ms 5.4 ms

Figure 5: Performance of our zk-SNARK for the NP statement POUR. (N = 10, σ ≤ 2.5%)

Performance of Zerocash algorithms

In Figure 6 we report performance characteristics for each of the six DAP scheme algorithms in our implementation (single-thread on our desktop machine). For VerifyTransaction, we separately report the cost of verifying mint and pour transactions and, in the latter case, we exclude the cost of scanning L (e.g., to check if a serial number is duplicate);20 for the case of Receive, we report the cost to process a given pour transaction in L.

We obtain that:

    • Setup takes about 5 minutes to run; its running time is dominated by the running time of KeyGen on CPOUR. (Either way, Setup is run only once.) The size of the resulting public parameters pp is dominated by the size of pkPOUR.
    • CreateAddress takes 326.0 ms to run. The size of the resulting address key pair is just a few

hundred bytes.

20Naturally, if SNList has 264 serial numbers (the maximum possible in our implementation), then scanning is very expensive! However, we do not expect that a system like Zerocash will grow to 264 transactions. Still, such a system may grow to the point that scanning SNList is too expensive. We detail possible mitigations to this in Section 8.3.2.

    • Mint takes 23 µs to run. It results in a coin of size 463 B and mint transaction of size 72 B.
    • Pour takes about 2 minutes to run. Besides Setup, it is the only “expensive” algorithm to run; as expected, its running time is dominated by the running time of Prove. For a transaction string info, it results in (two new coins and) a pour transaction of size 996 B + |info|.
    • VerifyTransaction takes 8.3 µs to verify a mint transaction and 5.7 ms to verify a pour transaction; the latter’s time is dominated by that of Verify, which checks the zk-SNARK proof πPOUR.
    • Receive takes 1.6 ms per pour transaction.

Note that the above numbers do not include the costs of maintaining the Merkle tree because doing so is not the responsibility of the DAP scheme algorithms. Nevertheless, these additional costs are not large: (i) each update of the root of the CRH-based Merkle tree only requires dtree invocations of CRH, and (ii) an authentication path consists of only dtree digests of CRH. In our implementation, where CRH = H (the SHA256 compression function) and dtree = 64, each update requires 64 invocations of H and an authentication path requires 64 · 32 B = 2 KiB of storage.

Remark. If one does not want to rely on the ledger to communicate coins, via the ciphertexts C1, C2, and instead rely instead on some out-of-band mechanism (e.g., encrypted email), then the Receive algorithm is not needed, and moreover, many of the aforementioned sizes decrease because some pieces of data are not needed anymore; we denoted these pieces of data with “>” in Figure 6. (E.g., the size of an address key pair is reduced to only 64 B, and the size of a coin to only 120 B.)

Large-scale network simulation

Because Bitcoin mining typically takes place on dedicated GPUs or ASICs, the CPU resources to execute the DAP scheme algorithms are often of minimal consequence to network performance. There is one potential exception to this rule: the VerifyTransaction algorithm must be run by all of the network nodes in the course of routine transaction validation. The time it takes to perform this verification may have significant impact on network performance.

In the Zerocash implementation (as in Bitcoin), every Zerocash transaction is verified at each hop as it is forwarded though the network and, potentially, again when blocks containing the transaction are verified. Verifying a block consists of checking the proof of work and validating the contained transactions. Thus Zerocash transactions may take longer to spread though the network and blocks containing Zerocash transactions may take longer to verify. While we are concerned with the first issue, the potential impact of the second issue is cause for greater concern. This is because Zerocash transactions cannot be spent until they make it onto the ledger.

Because blocks are also verified at each hop before they are forwarded through the network, delays in block verification slow down the propagation of new blocks through the network. This causes nodes to waste CPU-cycles mining on out-of-date blocks, reducing the computational power of the network and making it easier to mount a “51% attack” (dishonest majority of miners) on the distributed ledger.

It is a priori unclear whether this potential issue is a real concern. Bitcoin caches transaction verifications, so a transaction that was already verified when it propagated through the network need not be verified again when it is seen in a block. The unknown is what percentage of transactions in a block are actually in any given node’s cache. We thus conduct a simulation of the Bitcoin network to investigate both the time it takes Zerocash transactions to make it onto the ledger and establish the effects of Zerocash transactions on block verification and propagation. We find that Zerocash transactions can be spent reasonably quickly and that the effects of increased block validation time are minimal.

Intel

Core i7-4770 @ 3.40GHz 16GB of RAM

1 thread
Setup Time 5 min 17 s
Size of pp

size of pkPOUR

size of vkPOUR

y size of ppenc

size of ppsig

896 MiB

896 MiB

749 B

0 B

0 B

CreateAddress Time 326.0 ms
Size of addrpk

size of apk

y size of pkenc

343 B

32 B

311 B

Size of addrsk

size of ask

y size of skenc

319 B

32 B

287 B

Mint Time 23 µs
Size of coin c

size of addrpk

size of v size of ρ size of r size of s size of cm

463 B

343 B

8 B

32 B

48 B

0 B

32 B

Size of txMint

size of cm size of v size of k size of s

72 B

32 B

8 B

32 B

0 B

Pour Time 2 min 2.01 s
Size of txPour

size of rt

size of snold, snold

1 2

size of cmnew, cmnew

1 2

size of vpub size of info size of pksig

size of h1, h2

size of πPOUR y size of C1, C2

size of σ

996 B + |info|

32 B

2 × 32 B

2 × 32 B

8 B

|info|

66 B

2 × 32 B

288 B

2 × 173 B

64 B

VerifyTransaction Time for mint tx 8.3 µs
Time for pour tx (excludes L scan) 5.7 ms
Receive Time (per pour tx) 1.6 ms

Figure 6: Performance of Zerocash algorithms. Above, we report the sizes of ppenc and ppsig as 0 B, because these parameters are “hardcoded” in the libraries we rely on for Enc and Sig. (N = 10 with σ ≤ 2.5% for all except that, due to variability at short timescales, σ(Mint) ≤ 3.3 µs and σ(VerifyTransaction) ≤ 1.9 µs)

Simulation design. Because Zerocash requires breaking changes to the Bitcoin protocol, we cannot test our protocol in the live Bitcoin network or even in the dedicated testnet. We must run our own private testnet. For efficiency and cost reasons, we would like to run as many Bitcoin nodes as possible on the least amount of hardware. This raises two issues. First, reducing the proof of work to practical levels while still preserving a realistic rate of new blocks is difficult (especially on

virtualized hardware with variable performance). Second, the overhead of zk-SNARK verification prevents us from running many Bitcoin nodes on one virtualized server.

The frequency of new blocks can be modeled as a Poisson process with a mean of Λblock seconds.21 To generate blocks stochastically, we modify bitcoind to fix its block difficulty at a trivial level 22 and run a Poisson process, on the simulation control server, which trivially mines a block on a randomly selected node. This preserves the distribution of blocks, without the computational overhead of a real proof of work. Another Poisson process triggering mechanism, with a different mean Λtx, introduces new transactions at random network nodes.

To differentiate which transactions represent normal Bitcoin expenditures versus which contain Zerocash pour transactions, simulated Zerocash transactions pay a unique amount of bitcoins (we set this value arbitrarily at 7 BTC). If a transaction’s output matches this preset value, and it is not in verification cache, then our modified Bitcoin client inserts a 10 ms delay simulating the runtime of VerifyTransaction.23 Otherwise transactions are processed as specified by the Bitcoin protocol. We vary the amount of simulated Zerocash traffic by varying the number of transactions with this particular output amount. This minimizes code changes and estimates only the generic impact of verification delays and not of any specific implementation choice.

Methodology. Recent research [DW13] suggests that the Bitcoin network contains 16,000 distinct nodes though most are likely no longer participating: approximately 3,500 are reachable at any given time. Each node has an average of 32 open connections to randomly selected peers. As of November 2013, the peak observed transaction rate for Bitcoin is slightly under one transaction per second [Lee13].

In our simulation, we use a 1000-node network in which each node has an average of 32 peers, transactions are generated with a mean of Λtx = 1 s, a duration of 1 hour, and a variable percentage s of Zerocash traffic. To allow for faster experiments, instead of generating a block every 10 minutes as in Bitcoin, we create blocks at an average of every Λblock = 150 s (as in Litecoin, a popular altcoin).

We run our simulation for different traffic mixes, where s indicates the percentage of Zerocash transactions and s ∈ {0%, 25%, 50%, 75%, 100%}. Each simulation is run on 200 Amazon EC2 general-purpose m1.medium instances, in one region on a 10.10./16 private network. On each instance, we deploy 5 instances of bitcoind.24

Results. Transactions are triggered by a blocking function call on the simulation control node that must connect to a random node and wait for it to complete sending a transaction. Because the Poisson process modeling transactions generates delays between such calls and not between the exact points when the node actuals sends the transactions, the actual transaction rate is skewed. In our experiments the real transaction rate shifts away from our target of one per second to an average of one every 1.4 seconds.

In Figure 7 we plot three metrics for s ∈ {0%, 25%, 50%, 75%, 100%}. Each is the average defined over the data from the entire run of the simulation for a given s (i.e., they include multiple transactions and blocks).25 Transaction latency is the interval between a transaction’s creation and

21Since computational power is added to the Bitcoin network faster than the 2-week difficulty adjustment period, the frequency of block generation is actually skewed. As our experiments run for at most an hour, we ignore this.

22These code modifications have been rendered moot by the subsequent inclusion of a “regtest” mode in Bitcoin 0.9

that allows for precisely this type of behavior and block generation on command. At the time of our experiments, this feature was not available in a stable release. Future work should use this feature.

23We used a generous delay of 10 ms (higher than the time reported in Figure 6) to leave room for machines slower

than our desktop machine.

24Higher densities of nodes per VM resulted in issues initializing all of the bitcoind instances on boot.

25Because our simulated Bitcoin nodes ran on shared EC2 instances, they were subject to variable external load,

its inclusion in a block. Block propagation time comes in two flavors: (1) the average time for a new block to reach a node computed over the times for all nodes, and (2) the same average computed over only the last node to see the block.

Block verification time is the average time, over all nodes, required to verify a block. If verification caching was not effective, we would expect to see a marked increase in both block verification time and propagation time. Since blocks occur on average every 150 s, and we expect approximately one transaction each second, we should see 150 × 10 ms = 1500 ms of delay if all transactions were non-cached Zerocash transactions. Instead, we see worst case 80 ms and conclude caching is effective. This results in a negligible effect on block propagation (likely because network operations dominate). The time needed for a transaction to be confirmed, and hence spendable, is roughly 190 s. For slower block generation rates (e.g., Bitcoin’s block every 10 minutes) this should mean users must

wait only one block before spending received transactions.

200

Zerocash

180

160

140

time in seconds

time in seconds

120

100

80

60

40

20

0

0% 20% 40% 60% 80% 100%

ε

(a) Transaction latency

4.5

4

last node

every node

3.5

time in milliseconds

3

2.5

2

1.5

1

0.5

0

0% 20% 40% 60% 80% 100%

ε

  1. Block propagation time

80

70

Zerocash

60

50

40

30

20

10

0

0% 20% 40% 60% 80% 100%

ε

  1. Block verification time

Figure 7: The average values of the three metrics we study, as a function of g, the percentage of transactions that are Zerocash transactions. Note that, in (a), latency is undefined when g = 0 and hence omitted.

Optimizations and extensions

We outline several optimizations and extensions to Zerocash: everlasting anonymity (Section 8.1), faster block propagation (Section 8.2), and improved storage requirements (Section 8.3).

Everlasting anonymity

Since transactions may persist virtually forever on the ledger, users may wish to ensure the anonymity of their transactions also lasts forever, even if particular primitives are eventually broken (by cryptanalytic breakthrough, engineering progress, or quantum computers). As we now explain, the DAP scheme construction described in Section 4 is only computationally private, but can be modified to achieve everlasting anonymity.

Recall that every Pour operation publishes a pour transaction txPour = (rt, snold, snold, cmnew, cmnew,

1 2 1 2

vpub, info, ∗), where ∗ = (pksig, h1, h2, πPOUR, C1, C2, σ) and Ci = Eenc(pknew , (vnew, ρnew, rnew, snew)).

Observe that:

enc,i i

i i i

limiting the benchmark precision. Still, it clearly demonstrates that the mild additional delay does not cause catastrophic network effects.

    • Since hSig = CRH(pksig) and hi = PRF old (hSig), an unbounded adversary A can iterate over all x

pk

ask,i

until PRFpk(hSig) equals hi; with overwhelming probability, there is only one such x, in which

x

case it equals aold . Thus, A learns aold , and hence aold

a

sk,i

sk,i

pk,i

old sk,i

:= PRFaddr(0). This identifies the sender.

    • An unbounded A can also decrypt Ci, so to learn (vnew, ρnew, rnew, snew); then, A can try all

i

i

i

i

possible x until COMMsnew (vnew“COMMrnew (PRFaddr(0)”ρnew)) equals cmnew; with overwhelming

i

i

i

x

i

i

probability, there is only one such x, in which case it equals anew. This identifies the recipient.

sk,i

The above attacks can be prevented as follows. First, every sender must use any given address only once (for receiving or sending coins): after receiving a coin c, a user u should immediately generate a new address and pour c into a fresh one cj relative to the new address; only afterwards can u spend the coin. Second, a user should not put any data in a ciphertext Ci to communicate a coin’s information, but must instead use some (informationally-secure) out-of-band channel to do so. With these modifications (and recalling that COMM is statistically hiding and πPOUR is a perfect-zero-knowledge proof), one can verify that the pour transaction txPour is statistically hiding, i.e., leaks no information even to unbounded adversaries.26

Fast block propagation

As mentioned in Section 7.3, the higher block-verification time of Zerocash compared to, e.g., Bitcoin does not affect much block propagation. Even so, we note a simple modification that further mitigates concerns. Upon receiving a block, a node validates the proof of work and (optionally) transactions other than mint and pour, and then forward the block right away. Only afterwards, the node executes VerifyTransaction on any mint/pour transactions, before accepting it for use in transacting. Thus, blocks are still validated by every node (so the security properties are unhampered), and propagation delays in the broadcast of blocks are reduced.

In principle, this opens the possibility of a denial-of-service attack, in which the network is spammed with invalid blocks which pass the proof-of-work check but contain invalid mint or pour transactions. However, this attack appears unrealistic given the enormous (by design) cost of creating blocks passing the proof-of-work check.

Improved storage requirements

Beyond the ledger L, users need to maintain two lists: CMList, the list of all coin commitments, and SNList, the list of all serial numbers of spent coins (see Section 3.1). In our construction, CMList is required to deduce authentication paths to create new pour transactions (via Pour), while SNList is used to verify pour transactions (via VerifyTransaction). As the ledger grows, both CMList and SNList grow in size, and can eventually impose substantial storage requirements (though both are derived from, and smaller than, the block chain per se). We now explain how these storage requirements can be mitigated, by relying on smaller representations of CMList and SNList that suffice within our construction.

Supporting many coin commitments

To execute the Pour algorithm to spend a coin c, a user u needs to provide an authentication path from c’s coin commitment to rt, the Merkle-tree root over CMList. If we make the following protocol modifications, u does not need all of CMList to compute this authentication path.

26As for mint transactions, one can verify that they are already statistically hiding, without any modifications.

In each block B of transactions, we store the Merkle-tree path pathB from the first coin commitment in B to the root rtB of the Merkle tree over CMList when the last block in the ledger is B. (In Zerocash, the additional per-block storage cost to store this information is only 2 KiB.)

Note that, given a block B and its successor block Bj, the corresponding authentication paths pathB and pathBj can be easily checked for consistency as follows. Let CMListB and CMListBj be the two lists of coin commitments corresponding to the two ledgers ending in block B and Bj respectively; since CMListB (i.e., coin commitments to “to the left” of pathB ) is a prefix of CMListBj , pathBj can be computed from pathB and B in time O(|B|dtree), where dtree is the tree depth.

When the user u first receives (or mints) the coin c, and its coin commitment is included in a

block B, u immediately computes pathB , by using the predecessor block and its authentication path. Afterwards, each time a new block is added to the ledger, u obtains a new path for c by using the new block and the old path for c. Thus, u only needs to act each time a new block is added, and each such update costs O(dtree) per transaction in the block.

Overall, u incurs a storage requirement of only O(dtree) for each coin he owns, and does not need to store CMList anymore.

Supporting many spent serial numbers

To execute the VerifyTransaction algorithm on a pour transaction txPour, a user u needs access to SNList (in order to check for duplicate serial numbers). Note, in Bitcoin, nodes need to maintain only the list of unspent transaction outputs, which is pruned as outputs are spent. In a DAP scheme, in contrast, nodes have to maintain SNList, which is a list that always grows. We now explain how to mitigate this storage requirement, in three incremental steps.

Step 1. The first step is to build a Merkle tree over SNList so to allow easy-to-verify non-membership proofs for SNList; this can be done by letting the leaves of the Merkle tree be the intervals of unspent serial numbers. Then, given the root rt of such tree, a serial number sn claimed to be unspent, and an authentication path path for an interval I, the user can check that path is valid for rt and that sn lies in I; the root rt and path path would be part of the pour transaction txPour to be verified. The problem with this approach, however, is that generating path (and also updating rt) requires knowledge of all of SNList.

Step 2. Next, instead of maintaining SNList in a single Merkle tree, we divide SNList, maintaining its chronological order, into sublists of serial numbers SNList0, SNList1, . . . and build a Merkle tree over the intervals induced by each sublist (i.e., apply Step 1 to each sublist). This modification implies a corresponding modification for the auxiliary information stored in a pour transaction that allows VerifyTransaction to check it. Now, however, producing such auxiliary information is less expensive. Indeed, a user with a coin c should maintain a list of authentication paths pathc,0, pathc,1, . . . (one for each sublist). Only the last path, corresponding to the active sublist, needs to be updated when a serial number is added; the other sublists and authentication paths remain unchanged (and these old sublists can in fact be discarded). When the user spends the coin, he can simply include these paths in the pour transaction. While updating these paths is an efficient operation, computing the initial paths for c is not, as it still requires the full set of sublists.

Step 3. To enable users to avoid the initial cost of computing paths for a new coin, we proceed as follows. First, a coin c is extended to contain a time stamp Tc corresponding to when c is created (minted or poured into); the coin’s commitment is modified to depend on the timestamp, and the timestamp is included in the clear within the transaction that creates the coin. Then, a user, upon spending c, produces a zk-SNARK for the following NP statement: “for each Merkle-tree root created (or updated) after Tc there is an interval and an authentication path for that interval

such that the serial number of c is in that interval”. Depending on the number of Merkle trees in such an NP statement, such proofs may already be more efficient to produce, compared to the naive (Step 1) solution, using existing zk-SNARK implementations.

Concurrent work

Danezis et al. [DFKP13] suggest using zk-SNARKs to reduce proof size and verification time in Zerocoin. Our work differs from [DFKP13] in both supported functionality and scalability.

First, [DFKP13]’s protocol, like Zerocoin, only supports fixed-value coins, and is best viewed as a decentralized mix. Instead, we define, construct, and implement a full-fledged decentralized electronic currency, which provides anonymous payments of any amount.

Second, in [DFKP13], the complexity of the zk-SNARK generator, prover, and verifier all scale superlinearly in the number of coins, because their arithmetic circuit computes, explicitly, a product over all coins. In particular, the number of coins “mixed together” for anonymity cannot be large. Instead, in our construction, the respective complexities are polylogarithmic, polylogarithmic, and constant in the number of coins; our approach supports a practically-unbounded number of coins. While we do not rely on Pedersen commitments, our approach also yields statistical (i.e.,

everlasting) anonymity; see the discussion in Section 8.1.

Conclusion

Decentralized currencies should ensure a user’s privacy from his peers when conducting legitimate financial transactions. Zerocash provides such privacy protection, by hiding user identities, trans- action amounts, and account balances from public view. This, however, may be criticized for hampering accountability, regulation, and oversight. Yet Zerocash need not be limited to enforcing the basic monetary invariants of a currency system. The underlying zk-SNARK cryptographic proof machinery is flexible enough to support a wide range of policies. It can, for example, let a user prove that he paid his due taxes on all transactions without revealing those transactions, their amounts, or even the amount of taxes paid. As long as the policy can be specified by efficient nondeterministic computation using NP statements, it can (in principle) be enforced using zk-SNARKs, and added to Zerocash. This can enable automated, privacy-preserving verification and enforcement of a wide range of compliance and regulatory policies that would otherwise be invasive to check directly or might be bypassed by corrupt authorities. This raises research, policy, and engineering questions regarding which such policies are desirable and practically realizable.

Another research question is what new functionality can be realized by augmenting the capabilities already present in Bitcoin’s scripting language with zk-SNARKs that allow fast verification of expressive statements.

Acknowledgments

We thank Amazon for their assistance and kind donation of EC2 resources, and Gregory Maxwell for his advice regarding the Bitcoin codebase. We thank Iddo Ben-Tov and the SCIPR Lab members

— Daniel Genkin, Lior Greenblatt, Shaul Kfir, Gil Timnat, and Michael Riabzev — for inspiring discussions. We thank Sharon Kessler for editorial advice.

This work was supported by: Amazon.com through an AWS in Education research grant; the Broadcom Foundation and Tel Aviv University Authentication Initiative; the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370; the Check Point Institute for Information Security; the U.S. Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL) under contract FA8750-11-2-0211; the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement number 240258; the Israeli Centers of Research Excellence I-CORE program (center 4/11); the Israeli Ministry of Science and Technology; the Office of Naval Research under contract N00014-11-1-0470; the Simons Foundation, with a Simons Award for Graduate Students in Theoretical Computer Science; and the Skolkovo Foundation with agreement dated 10/26/2011.

The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

Overview of Bitcoin and Zerocoin

We provide an overview of the Bitcoin and Zerocoin protocols. For more details, we refer the reader to Nakamoto [Nak09] and Miers et al. [MGGR13] respectively.

Bitcoin

Bitcoin [Nak09] is a decentralized currency operated by a collection of mutually-distrusting peers. It consists of three basic components: (i) a peer-to-peer network for broadcasting new transactions;

  1. semantics for identifying and validating new transactions; and (iii) a protocol for maintaining a decentralized ledger, known as the block chain, that stores the history of all valid transactions so far. Identities in Bitcoin are represented via ECDSA public keys. Each user u generates an ECDSA key pair (vku, sku) and, to receive payments, publishes the verification key vku (or its hash) as an address. (In fact, there is no limit to the number of addresses that an individual user may possess.)

Transactions. A transaction tx represents a payment from a list of input transactions to a list of output recipients. More precisely, tx is specified by a list {Ij}j of inputs and a list {Oj}j of outputs. Each output Oj specifies a value vj, denominated in Satoshi (10 Satoshi amounts to 1 bitcoin), and a recipient specification rj, called ScriptPubKey. The specification rj is given in Bitcoin script, a stack-based non-Turing-complete language similar to Forth, and specifies the identity of the recipient of the vj Satoshi. Each input Ij references an output of a previous transaction txj: the reference is specified by a tuple (hj, kj, σj), where hj is the hash of txj, kj is an index specifying which output of txj is referenced, and σj, called ScriptSig, is a an input satisfying the ScriptPubKey of the kj-th output of txj. Typically, the ScriptPubKey specifies a public key that must sign the transaction spending the output and σj contains such a signature, hence their names. Inputs can only be claimed by one transaction to prevent double spending.

Σ

9

The total number of bitcoins output by a transaction, j vj, cannot exceed the total value of the referenced outputs. Any difference between these two quantities is claimed as a transaction fee (see below). Thus, any unspent inputs to a transaction become a fee, and transactions typically

have at least two outputs: one to the payment’s recipient and one back to the sender as “change”.

The block chain. Transactions are broadcast in the Bitcoin peer-to-peer network, but are considered valid only once they have been added to the the block chain. To assemble the block chain, miners (usually but not necessarily, network nodes) collect transactions from the Bitcoin network and bundle them into blocks. Miners then compete for the opportunity to append their own candidate block B to the block chain by searching for a string s such that the integer specified by SHA256(SHA256(Bs)) is below some threshold. To incentivize block creation, miners receive a protocol-specified reward (currently 25 BTC) for adding a new block and, moreover, receive per-transaction fees (whose value is specified by the transaction’s creator).

The proof of work protects a block against tampering and also ensures that meaningful compu- tational resources were devoted to finding it. This prevents a sybil attack since all the sybils share the same total computational resources (e.g., the server they are virtualized on). Bitcoin assumes that provided more than half the computational work is held by honest nodes, the block-chain is secure. (Though recent work [ES13] has suggested that the threshold may be larger than 50%.)

Zerocoin

Zerocoin extends Bitcoin by creating two new transaction types: mint and spend. A mint transaction allows a user to exchange a quantity of bitcoins for the right to mint a new zerocoin. Each zerocoin consists of a digital commitment cm to a random serial number sn. At a later point, a (potentially

different) user may issue a spend transaction containing a destination identity, the serial number sn, and a non-interactive zero-knowledge proof for the NP statement “I know secret cm and r such that (i) cm can be opened to sn with commitment randomness r, and (ii) cm was previously minted at some point in the past”. Crucially, the proof, being zero knowledge, does not link the spend transaction to any particular mint transaction (among all mint transactions so far). If the proof verifies correctly and the serial number has not been spent previously, the protocol semantics transfer a corresponding amount of bitcoins to the destination address. In this fashion, Zerocoin functions as a decentralized mix.

Zerocoin uses Pedersen commitments over a prime field Fp, i.e., cm := gsnhr, for random generators g, h of a subgroup of F∗p. The corresponding zero-knowledge proofs are constructed by first accumulating (via the Strong-RSA accumulator of [CL01]) the set of commitments of all

minted zerocoins, and then proving knowledge of the corresponding commitment randomness and membership in this set. For technical reasons, the proof requires a double-discrete-logarithm (DDL) Fiat–Shamir proof of size ≈ |p|λ, where λ is the security parameter. In practice, the size of these proofs exceeds 45 kB at the 128-bit security level, and require 450 ms or more to verify.

Also note that, in Zerocoin, computing the witness for the accumulator requires access to the entire set of commitments so far (though the witness can be incrementally updated for each insertion). This technique supports an unlimited number of coins. In contrast, our construction places a cap N on the number of coins (in our implementation, N = 264) but needs only log N updates to issue N new coins (and these updates can be efficiently batched, cf. Section 8.3.1).

Completeness of DAP schemes

A DAP scheme Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) is complete if no polynomial-size ledger sampler S can win the incompleteness experiment with more than negligible probability. In Section 3.4 we informally described this property; we now formally define it.

Deftnition B.1. Let Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) be a (candi- date) DAP scheme. We say that Π is complete if, for every poly(λ)-size ledger sampler S and sufficiently large λ,

AdvINCOMP(λ) < negl(λ) ,

Π,S

where AdvINCOMP(λ) := Pr[INCOMP(Π, S, λ) = 1] is S’s advantage in the incompleteness experiment.

Π,S

We now describe the incompleteness experiment mentioned above. Given a (candidate) DAP scheme Π, a ledger sampler S, and a security parameter λ, the (probabilistic) experiment INCOMP(Π, S, λ) consists of an interaction between S and a challenger C, terminating with a binary output by C.

At the beginning of the experiment, C samples pp ← Setup(1λ) and sends pp to S. Then, S

sends a ledger, two coins to be spent, and parameters for a pour transaction; more precisely, sends (1) a ledger L; (2) two coins cold, cold; (3) two address secret keys addrold , addrold ; (4) two

C S

1 2 sk,1 sk,2

values vnew, vnew; (5) new address key pairs (addrnew , addrnew), (addrnew , addrnew); (6) a public value

1 2 pk,1

sk,1

pk,2

sk,2

vpub; and (7) a transaction string info. Afterwards, C performs various checks on S’s message.

Concretely, C first checks that cold and cold are valid unspent coins, i.e., checks that: (i) cold

1

2

1

and cold are well-formed; (ii) their coin commitments cmold and cmold appear in (valid) transactions

2 1 2

on L; (iii) their serial numbers snold and snold do not appear in (valid) transactions on L. Next, C

1

2

checks that vnew + vnew + vpub = vold + vold (i.e., the values suggested by S preserve balance) and

1

2

1

2

vold + voldvmax (i.e., the maximum value is not exceeded). If any of these checks fail, C aborts

1

2

and outputs 0.

Otherwise, C computes rt, the Merkle-tree root over all coin commitments in L (appearing in valid transactions), and, for i ∈ {1, 2, }, pathi, the authentication path from commitment cmold to

i

the root rt. Then, C attempts to spend cold, cold as instructed by S:

1

2

(cnew, cnew, txPour) ← Pour(pp, rt, cold, cold, addrold , addrold , path1, path2, vnew, vnew, addrnew , addrnew , vpub, info) .

1

2

1

2

sk,1

sk,2

1

2

pk,1

pk,2

Finally, C outputs 1 if and only if any of the following conditions hold:

    • txPour =ƒ (rt, snold, snold, cmnew, cmnew, vpub, info, ∗), where cmnew, cmnew are the coin commitments

1 2 1 2 1 2

of cnew, cnew; OR

1 2

    • txPour is not valid, i.e., VerifyTransaction(pp, txPour, L) outputs 0; OR
    • for some i ∈ {1, 2}, the coin cnew is not returned by Receive(pp, (addrnew, addrnew), Lj), where Lj

i

is the ledger obtained by appending txPour to L.

pk,i

sk,i

Remark. There is no need for the challenger C check that, in turn, both cnew and cnew are spendable,

1

2

because this follows by induction. Namely, if cnew, cnew were not spendable, a different sampler Sj

1

2

(that simulates S and then computes and outputs cnew and cnew) would provide a counterexample

1 2

to the above definition.

Security of DAP schemes

A DAP scheme Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) is secure if it satis- fies ledger indistinguishability, transaction non-malleability, and balance. (See Definition 3.2.) In Section 3.4 we informally described these three properties; we now formally define them.

Each of the definitions employs an experiment involving a (stateful) DAP oracle ODAP that

receives and answers queries from an adversary A (proxied via a challenger C, which performs the experiment-specific sanity checks). Below, we first describe how ODAP works.

The oracle ODAP is initialized by a list of public parameters pp and maintains state. Internally, ODAP stores: (i) L, a ledger; (ii) ADDR, a set of address key pairs; (iii) COIN, a set of coins. All of L, ADDR, COIN start out empty. The oracle ODAP accepts different types of queries, and each query causes different updates to L, ADDR, COIN and outputs. We now describe each type of query Q.

    • Q = (CreateAddress)

1.Compute ( addrpk, addrsk) := CreateAddress(pp). 2.Add the address key pair ( addrpk, addrsk) to ADDR. 3.Output the address public key addrpk.

The ledger L and coin set COIN remain unchanged.

    • Q = (Mint, v, addrpk)

1.Compute ( c, txMint) := Mint(pp, v, addrpk). 2.Add the coin c to COIN.

3.Add the mint transaction txMint to L. 4.Output ⊥.

The address set ADDR remains unchanged.

    • Q = (Pour, idxold, idxold, addrold , addrold , info, vnew, vnew, addrnew , addrnew , vpub)

1

2

pk,1

pk,2

1

2

pk,1

pk,2

  1. Compute rt, the root of a Merkle tree over all coin commitments in L. 2.For each i 1, 2 :

∈ { }

    1. Let cmold be the idxold-th coin commitment in L.

i i

    1. Let txi be the mint/pour transaction in L that contains cmold. (c)Let cold be the first coin in COIN with coin commitment cmold.

i

i i

  1. Let ( addrold , addrold ) be the first key pair in ADDR with addrold being cold’s address.

pk,i

sk,i

pk,i i

  1. Compute pathi, the authentication path from cmold to rt.

i

  1. Compute (cnew, cnew, txPour) := Pour(pp, rt, cold, cold, addrold , addrold , path1, path2, vnew, vnew,

1 2 1 2

sk,1

sk,2 1 2

addrnew , addrnew , vpub, info).

pk,1 pk,2

  1. Verify that VerifyTransaction(pp, txPour, L) outputs 1.
  2. Add the coin cnew to COIN. 6.Add the coin cnew to COIN.

1

2

7.Add the pour transaction txPour to L. 8.Output ⊥.

If any of the above operations fail, the output is ⊥ (and L, ADDR, COIN remain unchanged).

    • Q = (Receive, addrpk)

1.Look up ( addrpk, addrsk) in ADDR. (If no such key pair is found, abort.) 2.Compute ( c1, . . . , cn) ← Receive(pp, (addrpk, addrsk), L).

  1. Add c1, . . . , cn to COIN.
  2. Output ( cm1, . . . , cmn), the corresponding coin commitments. The ledger L and address set ADDR remain unchanged.
    • Q = (Insert, tx)

1.Verify that VerifyTransaction(pp, tx, L) outputs 1. (Else, abort.) 2.Add the mint/pour transaction tx to L.

  1. Run Receive for all addresses addrpk in ADDR; this updates the COIN with any coins that might have been sent to honest parties via tx.
  2. Output ⊥.

The address set ADDR remains unchanged.

Remark. The oracle ODAP provides A with two ways to cause a pour transaction to be added to L. If A has already obtained address public keys addrpk,1 and addrpk,2 (via previous CreateAddress queries), then A can use a Pour query to elicit a pour transaction txPour (despite not knowing address secret keys addrsk,1, addrsk,2 corresponding to addrpk,1, addrpk,2). Alternatively, if A has himself generated both address public keys, then A knows corresponding address secret keys, and can invoke Pour “in his head” to obtain a pour transaction txPour, which he can add to L by using an Insert query. In the first case, both addresses belong to honest users; in the second, both to A. But what about pour transactions where one address belongs to an honest user and one to A?

Such pour transactions might arise from MPC computations (e.g., to make matching donations). The ledger oracle ODAP, as defined above, does not support such queries. While extending the definition is straightforward, for simplicity we leave handling such queries to future work.

Ledger indistinguishability

Ledger indistinguishability is characterized by an experiment L-IND, which involves a polynomial-size adversary A attempting to break a given (candidate) DAP scheme.

Deftnition C.1. Let Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) be a (candi- date) DAP scheme. We say that Π is L-IND secure if, for every poly(λ)-size adversary A and sufficiently large λ,

AdvL-IND(λ) < negl(λ) ,

Π,A

where AdvL-IND(λ) := 2 · Pr[L-IND(Π, A, λ) = 1] − 1 is A’s advantage in the L-IND experiment.

Π,A

We now describe the L-IND experiment mentioned above. Given a (candidate) DAP scheme Π, adversary A, and security parameter λ, the (probabilistic) experiment L-IND(Π, A, λ) consists of an interaction between A and a challenger C, terminating with a binary output by C.

At the beginning of the experiment, C samples b ∈ {0, 1} at random, samples pp ← Setup(1λ),

and sends pp to A; next, C initializes (using pp) two separate DAP oracles ODAP and ODAP (i.e.,

0 1

the two oracles have separate ledgers and internal tables).

The experiment proceeds in steps and, at each step, C provides to A two ledgers (LLeft, LRight), where LLeft := Lb is the current ledger in ODAP and LRight := L1−b the one in ODAP; then A sends

j b 1−b

to C a pair of queries (Q, Q ), which must be of the same type (i.e., one of CreateAddress, Mint,

Pour, Receive, Insert). The challenger C acts differently depending on the query type, as follows.

    • If the query type is Insert, C forwards Q to ODAP, and Qj to ODAP. This allows A to insert his

b

own transactions directly in LLeft and LRight.

1−b

    • For any other query type, C first ensures that Q, Qj are publicly consistent (see below) and then forwards Q to ODAP, and Qj to ODAP; letting (a0, a1) be the two oracle answers, C replies to A

0

1

with (ab, a1−b). This allows A to elicit behavior from honest users. However note that A does not know the bit b, and hence the mapping between (LLeft, LRight) and (L0, L1); in other words, A does not know if he elicits behavior on (L0, L1) or on (L1, L0).

At the end of the experiment, A sends C a guess bj ∈ {0, 1}. If b = bj, C outputs 1; else, C outputs 0.

Public consistency. As mentioned above, A sends C pairs of queries (Q, Qj), which must be of the same type and publicly consistent, a property that we now define. If Q, Q are both of type CreateAddress or Receive, then they are always publicly consistent. In the special case of CreateAddress we require that both oracles generate the same address. If they are both of type Mint, then the minted value in Q must equal that in Qj. Finally, if they are both of type Pour, the following restrictions apply.

j

First, Q, Qj must be individually well-formed; namely, (i) the coin commitments referenced by

Q (via the two indices idxold, idxold) must correspond to coins cold, cold that appear in the ledger

1 2 1 2

oracle’s coin table COIN; (ii) the two coins cold, cold must be unspent (i.e. their serial numbers must

1 2

not appear in a valid pour transactions on the corresponding oracle’s ledger); (iii) the address public

keys specified in Q must match those in cold, cold; and (iv) the balance equation must hold (i.e.,

1 2

vnew + vnew + vpub = vold + vold).

1 2 1 2

Furthermore, Q, Qj must be jointly consistent with respect to public information and ’s view; namely: (i) the public values in Q and Qj must equal; (ii) the transaction strings in Q and Qj must equal; (iii) for each i ∈ {1, 2}, if the i-th recipient addresses in Q is not in ADDR (i.e., belongs to A) then vnew in both Q and Qj must equal (and vice versa for Qj); and (iv) for each i ∈ {1, 2}, if the i-th index in Q references (in L0) a coin commitment contained in a transaction that was posted via an Insert query, then the corresponding index in Qj must reference (in L1) a coin commitment that also appears in a transaction posted via an Insert query and, moreover, vold in both Q and Qj must equal (and vice versa for Qj). The challenger C learns vold by looking-up the corresponding coin cold

A

i

i

i

i

in the oracle’s coin set COIN. (v) for each i ∈ {1, 2} if the i-th index in Q must not reference a coin that has previously been spent.

Transaction non-malleability

Transaction non-malleability is characterized by an experiment TR-NM, which involves a polynomial- size adversary A attempting to break a given (candidate) DAP scheme.

Deftnition C.2. Let Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) be a (candi- date) DAP scheme. We say that Π is TR-NM secure if, for every poly(λ)-size adversary A and sufficiently large λ,

AdvTR-NM(λ) < negl(λ) ,

Π,A

where AdvTR-NM(λ) := Pr[TR-NM(Π, A, λ) = 1] is A’s advantage in the TR-NM experiment.

Π,A

We now describe the TR-NM experiment mentioned above. Given a (candidate) DAP scheme Π, adversary A, and security parameter λ, the (probabilistic) experiment TR-NM(Π, A, λ) consists of an interaction between A and a challenger C, terminating with a binary output by C.

At the beginning of the experiment, C samples pp ← Setup(1λ) and sends pp to A; next, C initializes a DAP oracle ODAP with pp and allows A to issue queries to ODAP. At the end of the experiment, A sends C a pour transaction tx, and C outputs 1 if and only if the following conditions hold. Letting T be the set of pour transactions generated by ODAP in response to Pour queries,

there exists tx ∈ T such that: (i) tx = tx; (ii) VerifyTransaction(pp, tx, Lj) = 1, where Lj is the

portion of the ledger preceding tx;27 and (iii) a serial number revealed in tx is also revealed in tx.

Balance

Balance is characterized by an experiment BAL, which involves a polynomial-size adversary A

attempting to break a given (candidate) DAP scheme.

Deftnition C.3. Let Π = (Setup, CreateAddress, Mint, Pour, VerifyTransaction, Receive) be a (can- didate) DAP scheme. We say that Π is BAL secure if, for every poly(λ)-size adversary A and sufficiently large λ,

AdvBAL(λ) < negl(λ) ,

Π,A

where AdvBAL(λ) := Pr[BAL(Π, A, λ) = 1] is A’s advantage in the BAL experiment.

Π,A

We now describe the BAL experiment mentioned above. Given a (candidate) DAP scheme Π, adversary A, and security parameter λ, the (probabilistic) experiment BAL(Π, A, λ) consists of an interaction between A and a challenger C, terminating with a binary output by C.

At the beginning of the experiment, C samples pp ← Setup(1λ), and sends pp to A; next, C (using pp) initializes a DAP oracle ODAP and allows A to issue queries to ODAP. At the conclusion of the experiment, A sends C a set of coins Scoin. Recalling that ADDR is the set of addresses returned by CreateAddress queries (i.e., addresses of “honest” users), C computes the following five quantities.

    • vUnspent, the total value of all spendable coins in Scoin. The challenger C can check if a coin c Scoin is spendable as follows: mint a fresh coin c of value 0 (via a Mint query) and check if a corresponding Pour query consuming c, cj yields a pour transaction txPour that is valid.

j

    • vMint, the total value of all coins minted by A. To compute vMint, the challenger C sums up the values of all coins that (i) were minted via Mint queries using addresses not in ADDR, or
    1. whose mint transactions were directly placed on the ledger via Insert queries.
    • vADDR→A, the total value payments received by A from addresses in ADDR. To compute vADDR→A, the challenger C looks up all pour transactions placed on the ledger via Pour queries and sums up the values that were transferred to addresses not in ADDR.

27That is, Lj is the longest ledger prefix that can be used to spend at least one of the coins spent in tx.

    • vA→ADDR, the total value of payments sent by A to addresses in ADDR. To compute vA→ADDR, the challenger C first deduces the set Sj ⊆ COIN of all coins received by honest parties and then sums up the values of coins in Sj. (Note that C can compute Sj by selecting all coins in COIN that are both tied to an address in ADDR and arose from transactions placed on the ledger by Insert queries.)
    • vBasecoin, the total value of public outputs placed by A on the ledger. To compute vBasecoin, the challenger C looks up all pour transactions placed on the ledger by Insert and sums up the corresponding vpub values.

At the end of the experiment, C outputs 1 if vUnspent + vBasecoin + vA→ADDR > vMint + vADDR→A; else,

C outputs 0.

Remark. There are two methods for A to spend more public-output money than he owns: (i) by directly inserting transactions on the ledger, and (ii) by asking honest parties to create such transactions. The first method is accounted for in the computation of vBasecoin, while the second method is accounted for in the computation of vA→ADDR (since A must first pay the honest party).

Proof of Theorem 4.1

We prove Theorem 4.1. We omit a formal proof of the completeness claim; one can verify that the DAP scheme’s completeness follows, in a straightforward way, from the completeness of the construction’s building blocks. Next, we argue security via three separate proofs, respectively showing that our construction satisfies (i) ledger indistinguishability, (ii) transaction non-malleability, and

    1. balance.

Proof of ledger indistinguishability

We describe a simulation 3sim in which the adversary A interacts with a challenger C, as in the L-IND experiment. However 3sim differs from the L-IND experiment in a critical way: all answers sent to A are computed independently of the bit b, so that A’s advantage in 3sim is 0. The remainder of the proof is devoted to showing that AdvL-IND(λ) (i.e., A’s advantage in the L-IND experiment) is at most negligibly different than A’s advantage in 3sim.

Π,A

The simulation. The simulation 3sim works as follows. First, after sampling b ∈ {0, 1} at random, C samples pp ← Setup(1λ), with the following modification: the zk-SNARK keys are generated as (pkPOUR, vkPOUR, trap) ← Sim(1λ, CPOUR), to obtain the zero-knowledge trapdoor trap. Then, as in the L-IND experiment, C sends pp to A, and then initializes two separate DAP oracles ODAP and ODAP.

0

1

Afterwards, as in L-IND, 3sim proceeds in steps and, at each step, C provides to A two ledgers (LLeft, LRight), where LLeft := Lb is the current ledger in ODAP and LRight := L1−b the one in ODAP;

j b 1−b

then A sends to C a message (Q, Q ), which consist of two (publicly-consistent) queries of the same

type. The challenger C acts differently depending on the query type, as follows.

    • Answering CreateAddress queries. In this case, Q = Qj = CreateAddress.

To answer Q, C behaves as in L-IND, except for the following modification: after obtaining (addrpk, addrsk) ← CreateAddress(pp), C replaces apk in addrpk with a random string of the appro- priate length; then, C stores addrsk in a table and returns addrpk to A.

Afterwards, C does the same for Qj.

    • Answering Mint queries. In this case, Q = (Mint, v, addrpk) and Qj = (Mint, v, addrjpk).

To answer Q, C behaves as in L-IND, except for the following modification: the Mint algorithm computes the commitment k as COMMr(τ ρ), for a random string τ of the appropriate length, instead of as COMMr(apkρ), where apk is the value specified in addrpk.

Afterwards, C does the same for Qj.

    • Answering Pour queries. In this case, Q and Qj both have the form (Pour, cmold, cmold, addrold ,

addrold , info, vnew, vnew, addrnew , addrnew , vnew).

1 2 pk,1

pk,2 1 2

pk,1

pk,2

pub

To answer Q, C modifies the way some values are computed:

      1. Compute rti by accumulating all of the valid coin commitments on Li. 2.Set vpub and info to the corresponding input values.
  1. For each j 1, 2 :

∈ { }

    1. Sample a uniformly random snold.

j

    1. If addrnew

pk,j

is an address generated by a previous query to CreateAddress, (i) sample

a coin commitment cmnew on a random input, (ii) run Kenc(ppenc) → (pkenc, skenc) and compute Cnew := Eenc(pkenc, r) for a random r of suitable length.

j

j

    1. Otherwise, calculate ( cmnew, Cnew) as in the Pour algorithm.28

i i

  1. Set h1 and h2 to be random strings of the appropriate length.
  2. Compute all remaining values as in the Pour algorithm
  3. The pour proof is computed as πPOUR := Sim(trap, x), where x := (rt, snold, snold, cmnew, cmnew, vpub, h1, h2).

Afterwards, C does the same for Qj.

1 2 1 2

    • Answering Receive queries. In this case, Q = (Receive, addrpk) and Qj = (Receive, addrjpk). The answer to each query proceeds as in the L-IND experiment.
    • Answering Insert queries. In this case, Q = (Insert, tx) and Q = (Insert, txj). The answer to each query proceeds as in the L-IND experiment.

In each of the above cases, the response to A is computed independently of the bit b. Thus, when

A outputs a guess bj, it must be the case that Pr [ b = bj ] = 1/2, i.e., A’s advantage in 3sim is 0.

Proof that the simulation is indistinguishable from the real experiment. We now describe a sequence of hybrid experiments (3real, 31, 32, 33, 3sim) in each of which a challenger C conducts a modification of the L-IND experiment with A. We define 3real to be the original L-IND experiment, and 3sim to be the simulation described above.

With a slight abuse of notation, given experiment 3, we define Adv3 to be the absolute value of the difference between (i) the L-IND advantage of A in 3 and (ii) the L-IND advantage of A in 3real. Also, let

      • qCA be the total number of CreateAddress queries issued by A,
      • qP be the total number of Pour queries issued by A, and

qM be the total number of Mint queries issued by .

  • A

Finally, define AdvEnc to be ’s advantage in Enc’s IND-CCA and IK-CCA experiments, AdvPRF to be

A

COMM

A’s advantage in distinguishing the pseudorandom function PRF from a random one, and Adv

to be A’s advantage against the hiding property of COMM. We now describe each of the hybrid experiments.

i

28Note that by the restrictions of the experiment, the value vnew is identical between QLeft and QRight.

    • Experiment 31. The experiment 31 modifies 3real by simulating the zk-SNARKs. More precisely, we modify 3real so that C simulates each zk-SNARK proof, as follows. At the beginning of the experiment, instead of invoking KeyGen(1λ, CPOUR), C invokes Sim(1λ, CPOUR) and obtains (pkPOUR, vkPOUR, trap). At each subsequent invocation of the Pour algorithm, C computes πPOUR ← Sim(trap, x), without using any witnesses, instead of using Prove. Since the zk-SNARK system is perfect zero knowledge, the distribution of the simulated πPOUR is identical to that of the proofs computed in 3real. Hence Adv31 = 0.
    • Experiment 32. The experiment 32 modifies 31 by replacing the ciphertexts in a pour transaction by encryptions of random strings. More precisely, we modify 31 so that, each time issues a Pour query where one of the output addresses (addrnew , addrnew ) is in the set of addresses

A

pk,1 pk,2

previously generated by a CreateAddress query, the two ciphertexts Cnew, Cnew are generated

1 2

as follows: (i) (pknew, sknew) ← Kenc(ppenc); (ii) for each j ∈ {1, 2}, Cnew := Eenc(pknew , r) where r

enc

enc

j

enc,j

is a message sampled uniformly from the plaintext space of the encryption scheme. By Lemma D.1

(see below), |Adv32 − Adv31 | ≤ 4 · qP · AdvEnc.

    • Experiment 33. The experiment 33 modifies 32 by replacing all PRF-generated values with random strings. More precisely, we modify 32 so that:
  • each time A issues a CreateAddress query, the value apk within the returned addrpk is substituted with a random string of the same length;
  • each time A issues a Pour query, each of the serial numbers snold, snold in txPour is substituted

1

2

with a random string of the same length, and hinfo with a random string of the same length.

By Lemma D.2 (see below), |Adv33 − Adv32 | ≤ qCA · AdvPRF.

    • Experiment 3sim. The experiment 3sim is already described above. For comparison, we explain how it differs from 33: the coin commitments are replaced with commitments to random inputs. More precisely, we modify 33 so that:
  • each time A issues a Mint query, the coin commitment cm in txMint is substituted with a commitment to a random input; and
  • each time A issues a Pour query, then, for each j ∈ {1, 2}, if the output address addrnew is in the set of addresses previously generated by an CreateAddress query, cm is substituted

j

new

pk,j

with a commitment to a random input.

By Lemma D.3 (see below), |Adv3sim − Adv33 | ≤ (qM + 4 · qP) · AdvCOMM.

As argued above, the responses provided to A in 3sim are independent of the bit b, so that Adv3sim = 0. Then, by summing over A’s advantages in the hybrid experiments, we can bound A’s advantage in 3real by

AdvL-IND(λ) ≤ 4 · qP · AdvEnc + qCA · AdvPRF + (qM + 4 · qP) · AdvCOMM ,

Π,A

which is negligible in λ. This concludes the proof of ledger indistinguishability. Below, we sketch proofs for the lemmas used above (Lemma D.1, Lemma D.2, and Lemma D.3).

Lemma D.1. Let AdvEnc be the maximum of:

      • A ’s advantage in the IND-CCA experiment against the encryption scheme Enc, and
      • A ’s advantage in the IK-CCA experiment against the encryption scheme Enc. Then after qP Pour queries, |Adv32 − Adv31 | ≤ 4 · qP · AdvEnc.

Proof sketch. Define s := Adv32 − Adv31 . Using A, we first show how to construct a solver with advantage ≥ s in the IK-CCA or IND-CCA experiments. We use a hybrid H, intermediate between

·

31 and 3

2 qP

2; concretely, H modifies 31

so that each ciphertext (where the corresponding public key

appears in the set generated by a CreateAddress query) is replaced with the encryption of the same plaintext, but under a new, random public key generated via the Kenc algorithm. (For comparison, 32 modifies H so that each plaintext is replaced with a random plaintext drawn from the plaintext space.) We now argue that A’s advantage in distinguishing H and 31 is at most 2 · qP · AdvEnc, and so is for distinguishing 32 and H. Overall, we deduce that |Adv32 − Adv31 | ≤ 4 · qP · AdvEnc.

First, we discuss H and 31. For some j ∈ {1, . . . , qCA}, when A makes the j-th query of the form CreateAddress, query the IK-CCA challenger to obtain two public keys (pkenc,0, pkenc,1) and return pkenc := pkenc,0 in the response to A. At the time A issues a Pour query that results in the

  1. th ciphertext Ci being encrypted under pkenc, query the IK-CCA challenger on the corresponding plaintext m and receive C = Eenc(pkenc,¯b, m) where ¯b is the bit chosen by the IK-CCA challenger. Substitute Ci := C and write the resulting txPour to the Ledger. When A outputs bj we return this guess as our guess in the IK-CCA experiment. We note that when ¯b = 0 then A’s view of the interaction is distributed identically to that of 31, and when ¯b is 1 then A’s view represents an

intermediate hybrid where one key has been substituted. By a standard hybrid argument over each

of the 2 qP ciphertexts, we note that over the random coins of the experiment, our solver must succeed in the IK-CCA experiment with advantage s . If we assume a maximum adversarial

·

qP

advantage AdvEnc against the IK-CCA experiment for the encryption scheme, then we get that

H 32 Enc

. .Adv − Adv ≤ 2 · q · Adv .

P

Next, we discuss 32 and H; the argument is similar to the above one. This time, rather than replacing the key used to encrypt, we replace the plaintext with a random message drawn from the plaintext space; this final distribution is the same as in 32. We omit the formal description of the resulting IND-CCA solver (which essentially follows the pattern above), and simply note that

3 H Enc2

.Adv − Adv . ≤ 2 · qP · Adv .

Lemma D.2. Let AdvPRF be A’s advantage in distinguishing the pseudorandom function PRF from a random function. Then, after qCA CreateAddress queries, |Adv33 − Adv32 | ≤ qCA · AdvPRF.

Proof sketch. We first describe a hybrid H, intermediate between 32 and 33, in which all values computed using the first (rather than all) oracle-generated key ask are replaced with random strings. Then, we show that A’s advantage in distinguishing between H and 32 is at most AdvPRF. Finally, we extend the argument to all qCA oracle-generated keys (corresponding to what happens in 33).

We now describe H. On receiving A’s first CreateAddress query, replace the public address

addrpk = (apk, pkenc) with addrpk = (τ, pkenc) where τ is a random string of the appropriate length.

On each subsequent Pour query, for each i ∈ {1, 2}, if addrold = addrpk then:

pk,i

    1. in the output txPour, replace snold with a random string of appropriate length;

i

    1. in the output txPour, replace each of h1, h2 with a random string of appropriate length. 3.simulate the zk-SNARK proof πPOUR for the new transaction.

Note that the above modifications do not affect the computation of the zk-SNARK proof πPOUR,

because πPOUR is simulated with the help of a trapdoor.

We now argue that A’s advantage in distinguishing between H and 32 is at most AdvPRF. Let ask be the random, secret seed for PRF generated by the oracle in answering the first CreateAddress query. In 32 (as in 3real):

addr

  • apk := PRFask (0);

sn

  • for each i ∈ {1, 2}, sni := PRFask (ρ) for a random (and not previously used) ρ
  • for each i ∈ {1, 2}, hi := PRFask (ihSig) and, with overwhelming probability, hSig is unique.

pk

Moreover, each of PRFaddr, PRFsn , PRFpk are constructed from PRFa as specified in Section 4.1.

ask

ask

ask sk

Note that, with overwhelming probability, no two calls to PRFask are made on the same input. First,

even identical inputs passed to PRFaddr, PRFsn , PRFpk produce different underlying calls to PRFa .

ask

ask

ask sk

Second, within each construction, there is exactly one call to PRFaddr, and the calls to PRFsn are

ask ask

each by definition unique. Finally, with overwhelming probability, the calls to PRFpk from different

ask

transactions each reference a distinct digest hSig, and, within a given transaction, the two calls each begin with a distinct prefix.

Now let O be an oracle that implements either PRFask or a random function. We show that if A distinguishes H from 32 with probability s, then we can construct a distinguisher for the two cases of O. In either case we use O to generate all values computed using PRFaddr, PRFsn , PRFpk .

ask

ask

ask

Clearly, when O implements PRFask , the distribution of the experiment is identical to 32; instead, when implements a random function, the distribution of the experiment is identical to H. Thus,

O

PRF

A’s advantage is at most Adv .

Finally, by a standard hybrid argument, we extend the above to all qCA oracle-generated addresses; then, A’s differential distinguishing advantage is at most qCA · AdvPRF. Because this final hybrid is equal to 33, we deduce that |Adv33 − Adv32 | ≤ qCA · AdvPRF.

Lemma D.3. Let AdvCOMM be A’s advantage against the hiding property of COMM. After qM

Mint queries and qP Pour queries, |Adv3sim − Adv33 | ≤ (qM + 4 · qP) · AdvCOMM.

Proof sketch. We only provide a short sketch, because the structure of the argument is similar to the one used to prove Lemma D.2 above.

For the first Mint or Pour query, replace the “internal” commitment k := COMMr(apkρ) with a random string of the appropriate length. Since ρ is random (and unique), then A’s advantage in distinguishing this modified experiment from 32 is at most AdvCOMM. Then, if we similarly modify all qM Mint queries and all qP Pour queries, by replacing the resulting qM + 2 · qP internal commitments with random strings, we can bound A’s advantage by (qM + 2 · qP) · AdvCOMM.

Next, in a similar vein, if replace the coin commitment in the first Pour with a commitment to a random value, then ’s advantage in distinguishing this modified experiment from the above one is at most AdvCOMM. Then, if we similarly modify all qP Pour queries, by replacing the resulting 2 · qP coin commitments with random strings, we obtain the experiment 3sim, and deduce that

A

|Adv3sim − Adv33 | ≤ (qM + 4 · qP) · AdvCOMM.

Proof of transaction non-malleability

Letting be the set of pour transactions generated by DAP in response to Pour queries, recall that wins the TR-NM experiment whenever it outputs tx such that there exists txj such that:

A ∈ T

T O

  1. tx = txj; (ii) VerifyTransaction(pp, tx, Lj) = 1, where Lj is the portion of the ledger preceding txj; and (iii) a serial number revealed in tx is also revealed in txj. Being a pour transaction, tx has the form (rt, snold, snold, cmnew, cmnew, vpub, info, ∗), where ∗ := (pksig, h1, h2, πPOUR, C1, C2, σ); set

ƒ

1

2

1

2

hSig := CRH(pksig). Let pkjsig be the corresponding public key in txj and set hjSig := CRH(pkjsig).

Define s := AdvTR-NM(λ), and let QCA = {ask,1, . . . , ask,q } be the set of internal address keys

Π,A

CA

created by C in response to A’s CreateAddress queries. Let QP = (pksig,1, . . . , pksig,qP ) be the set of signature public keys created by C in response to A’s Pour queries. We decompose the event in which A wins into the following four disjoint events.

  • Eventsig: A wins, and there is pkjsjig ∈ QP such that pksig = pkjsjig.
  • Eventcol: A wins, the above event does not occur, and there is pkjsjig ∈ QP such that hSig =

CRH(pkjsjig).

  • Eventmac: A wins, the above two events do not occur, and hi = PRFa (ihSig) for some i ∈ {1, 2} and a ∈ QCA.

pk

  • Eventkey: A wins, the above three events do not occur, and hi

a

and a ∈ QCA.

PRFpk(ihSig) for all i ∈ {1, 2}

Clearly, s = Pr [ Eventsig ] + Pr [ Eventcol ] + Pr [ Eventkey ] + Pr [ Eventmac ]. Hence, to show that s is negligible in λ, it suffices to argue that each of these probabilities is negligible in λ.

Bounding the probability of Eventsig. Define s1 := Pr [ Eventsig ]. Let σ be the signature in tx, and σjj be the signature in the first pour transaction txjj ∈ T that contains pkjsjig. When Eventsig occurs, since pksig = pkjsjig, the two signatures are with respect to the same public key. Moreover, since tx is valid, sig(pksig, m, σ) = 1 where m is everything in tx but for σ. Let mjj consist of all elements in txjj but for σjj. Observe that whenever tx = txjj we also have (m, σ)

ƒ

V

= (mjj, σjj). We use this fact below to show that forges a signature with non-negligible probability.

ƒ A

First, we argue that, conditioned on Eventsig, tx = txjj with overwhelming probability; we do so by way of contradiction. First, since wins, by definition there is txj such that tx = txj and yet each of tx and txj share one serial number. Therefore: (i) tx = txj; and (ii) if tx = txjj then txjj and txj also share a serial number. However the probability that txj and txjj share a serial number is bounded by the probability p˜ that contains two transactions that share the same serial number. Because each serial number is computed as PRFsn (ρ), where ρ is random, p˜ is negligible. We conclude that tx ƒ= txjj with all but negligible probability.

ƒ

T

ask

A ∈ T ƒ

ƒ

Next, we describe an algorithm B, which uses A as a subroutine, that wins the SUF-1CMA game against Sig with probability s1/qP. After receiving a verification key pkjsjig from the SUF-1CMA challenger, the algorithm B performs the following steps.

    1. B selects a random index j ← {1, . . . , qP}.
    2. B conducts the TR-NM experiment with A, except that, when A issues the j-th Pour query, B executes Pour as usual, but modifies the resulting pour transaction txjj as follows: (i) it substitutes pkjsjig for the signature public key in txjj; (ii) it queries the SUF-1CMA challenger to obtain σjj on the appropriate message mjj; and (iii) it substitutes σjj for the signature in txjj.
    3. When A outputs tx, B looks into tx to obtain pksig, m, and σ.
    4. If pksig ƒ= pkjsjig then B aborts; otherwise B outputs (m, σ) as a forgery for Sig.

jj

Note that tx has the same distribution has an “untampered” pour transaction; thus, all transactions

returned to A are distributed as in the TR-NM experiment. Since the index j is selected at random, B succeeds in the experiment with probability at least s1/qP. Because Sig is SUF-1CMA, s1 must be negligible in λ.

Bounding the probability of Eventcol. Define s2 := Pr [ Eventcol ]. When Eventcol occurs,

A receives a transaction txj containing a public key pkjsjig, and subsequently outputs a transaction

tx containing a public key pksig such that (i) pksig pkjsjig, but (ii) CRH(pksig) = CRH(pkjsig). In

particular, A finds collisions for CRH with probability s2. Because CRH is collision resistant, s2

must be negligible in λ.

Bounding the probability of Eventmac. Define s3 := Pr [ Eventmac ]. We first define an exper- iment 31, which modifies the TR-NM experiment as follows. When C samples pp ← Setup(1λ), the sub-call to (pkPOUR, vkPOUR) ← KeyGen(1λ, CPOUR) is replaced by (pkPOUR, vkPOUR, trap) ← Sim(1λ, CPOUR), so to obtain the zero-knowledge trapdoor trap. Afterwards, each time A issues a Pour query, C replaces the zk-SNARK proof in the resulting pour transaction with a simulated proof, obtained by running Sim(trap, x) for an appropriate input x. Because the zk-SNARK is perfect zero knowledge, Pr [ Eventmac ] = s3 in the 31 experiment as well.

Assume by way of contradiction that s3 is non-negligible. We now show how to construct an attacker B, which uses A as a subroutine, that distinguishes PRF from a random function RAND with non-negligible probability. The algorithm B, which has access either to O = PRF or O = RAND, “interfaces” between A and C in the experiment 31 above, as follows.

  1. First, B selects a random index j ← {1, . . . , qCA}, which identifies ask,j ∈ QCA.
  2. Next, B uses the oracle O instead of PRFask,j , i.e., anytime a value needs to be computed depending on PRFask,j (z), for some z, O(z) is used instead. (For instance, the public address key apk,j is one such value.)
  3. Finally, after A outputs tx:
    1. if O has been previously evaluated the expression “PRFpk (ihSig)” using O, B aborts

ask,j

and outputs 1;

    1. otherwise, B evaluates the expression “PRFpk (ihSig)” by using O; if the result equals

ask,j

hi, B outputs 1, else it outputs 0.

Conducting the above strategy does not require knowledge of ask,j because, having the simulation

trapdoor, B does not ne.ed Σwitnesses to genΣ erate Σ(valid) zk-SNARΣK. proofs.

We now argue that .Pr BPRF (1λ) = 1 − Pr BRAND(1λ) = 1 . is non-negligible.

  • Case 1: O = RAND. Observe that:

Pr Σ BRAND(1λ) = 1 | BRAND(1λ) does not abort Σ = 2ω .

where ω is the output length of PRF. Hence:

Pr Σ BRAND(1λ) = 1 Σ = .1 − Pr Σ BRAND(1λ) aborts ΣΣ · 2ω + Pr Σ BRAND(1λ) aborts Σ.

  • Case 2: O = PRF. In this case the distribution of the simulation is identical to that of 31, and B has set ask,j equal to the seed used by O. Recall that, when Eventmac holds, hi = PRFpk(ihSig) for some a ∈ QCA. Since A’s view of the experiment is independent of j, the probability that

a

a = ask,j is at least 1/qCA, and the probability that hi = PRFpk (ihSig) is at least s3/qCA.

ask,j

Hence:

Thus:

Pr Σ BPRF (1λ) = 1 | BPRF (1λ) does not abort Σ = s3/qCA .

Pr Σ BPRF (1λ) = 1 Σ = .1 − Pr Σ BPRF (1λ) aborts ΣΣ · s3/qCA + Pr Σ BPRF (1λ) aborts Σ.

thaΣt .Pr BPRF (1λ) = 1Σ − Pr ΣBRAND(1λ) = 1 . Σis non-negligible, it suffices to show that each of

Clear.ly, Σ2ω is negligibΣle; moΣreover, if s3 is nΣo.n-negligible, then so is |s3/qCA|. Thus, to show

Pr

BRAND(1λ) aborts

and Pr

BPRF (1λ) aborts

is negligible.

To do so, recall that B aborts if and only if it has previously evaluated the expression

pk

“PRF

ask,j

(ihSig)” using O prior to receiving A’s output. First note that B’s only calls to O occur

when it evaluates the functions PRFaddr, PRFsn and PRFpk. Moreover, due to the construction

of these functions it is not possible to evaluate the expression PRFpk (ihSig) using any calls to

ask,j

PRFaddr or PRFsn. Thus B aborts if and only if it has previously queried PRFpk on the expression

pk

PRF

ask,j

(ihSig). However it is easy to see that this cannot happen under the conditions of Eventmac,

since such a query would imply the condition Eventsig or Eventcol, each of which is excluded by

Eventmac. Hence the probability of either condition occurring is 0.

Bounding the probability of Eventkey. Define s4 := Pr [ Eventkey ], and let E be the zk-SNARK extractor for A. Assume by way of contradiction that s4 is non-negligible. We construct an algorithm

sn

sn

B that finds collisions for PRF with non-negligible probability (contradicting the fact that PRF

is collision resistant). The algorithm B works as follows.

  1. Run A (simulating its interaction with the challenger C) to obtain tx.
  2. Run (pkPOUR, vkPOUR) to obtain a witness a for the zk-SNARK proof πPOUR in tx.

E

  1. If a is not a valid witness for the instance x := (rt, snold, snold, cmnew, cmnew, vpub, hSig, h1, h2),

abort and output 0.

1 2 1 2

  1. Parse a as (path1, path2, cold, cold, addrold , addrold , cnew, cnew).

1 2 sk,1 sk,2 1 2

  1. For each i ∈ {1, 2}, parse cold as (addrold , vold, ρold, rold, sold, cmold).

i

pk,i

i

i

i

i

i

  1. For each i ∈ {1, 2}, parse addrold as (aold , skold ).

sk,i

sk,i

enc,i

(Note that, since a is a valid witness, snold = PRFsn

a

(ρold) for all i ∈ {1, 2}.)

  1. For each i ∈ {1, 2}:

i old i

sk,i

  1. Look for a pour transaction tx that contains snold.

i

∈ T

  1. If one tx is found, let ask and ρ be the seed and input used to compute snold in tx; thus,

sni = PRFask (ρ). If ask,i ƒ= ask, output

Note that, whenever Eventkey holds:

(ask,i, ρi ), (ask, ρ)

as a collision for PRF

.

old sn

old

. old

old Σ i sn

  • the proof πPOUR is valid and, with all but negligible probability, the witness a is valid;
  • the serial number snold or snold appears in some previous pour transaction in T ;

1 2

pk pk

  • whenever a is valid, it holds that h1 = PRF old (hSig) and h2 = PRF old (hSig), so that it cannot

ask,1

ask,2

be that aold

sk,1

old sk,2

= ask (as this contradicts the conditions of the event Eventkey).

Overall, we conclude that B finds a collision for PRFsn with probability s4 − negl(λ).

= a

Proof of balance

Define s := AdvBAL(λ); our goal is to show that s is negligible in λ. Recall that ADDR is the set of

Π,A

addresses returned by A’s CreateAddress queries.

Augmenting the ledger with witnesses. We modify the BAL experiment in a way that does not affect A’s view: the challenger C computes, for each pour transaction txPour on the ledger L (maintained by the oracle ODAP), a witness a = (path1, path2, cold, cold, addrold , addrold , cnew, cnew)

1

2

sk,1

sk,2

1

2

for the zk-SNARK instance x = (rt, snold, snold, cmnew, cmnew, vpub, hSig, h1, h2) corresponding to

1 2 1 2

txPour.29 In this way, C obtains an augmented ledger (L, ˙a), where ai is a witness for the zk-SNARK instance xi of the i-th pour transaction in L. Note that we can parse (L, ˙a) as a list of matched pairs (txPour, a) where txPour is a pour transaction in L and a is its corresponding witness.

The discussion below is relative to the above modification of the BAL experiment.

Balanced ledgers. We say that an augmented ledger (L, ˙a) is balanced if the following holds.

  1. Each (txPour, a) in (L, ˙a) contains openings (i.e., decommitments) of two distinct coin com- mitments cmold and cmold; also, each cmold is the output coin commitment of a pour or mint

1 2 i

transaction that precedes txPour on L.

  1. No two ( txPour, a) and (aj, txjPour) in (L, ˙a) contain openings of the same coin commitment.

29 Concretely, for pour transactions in L not inserted by A, C simply retains the witness a internally used by ODAP to generate the transaction. As for the (valid) pour transactions inserted by A, C uses the zk-SNARK multi-instance knowledge extractor corresponding to A; see Section 2.1. (If knowledge extraction fails, C aborts and outputs 1. However, this only happens with negligible probability.)

  1. Each (txPour, a) in (L, ˙a) contains openings of cmold, cmold, cmnew, cmnew to values vold, vold,

1 2 1 2 1 2

vnew, vnew (respectively), with the condition that vold + vold = vnew + vnew + vpub.

1 2 1 2 1 2

  1. For each ( txPour, a) in (L, ˙a) and for each i ∈ {1, 2}, the following conditions hold:
    1. If cmold is also the output of a mint transaction txMint on L, then the public value v in

i

txMint is equal to vold.

i

    1. If cmold is also the output of a pour transaction txj on L, then its witness aj contains

i Pour

an opening of cmold to a value vj that is equal to vold.

i i

  1. For each (txPour, a) in (L, ˙a), where txPour was inserted by , it holds that, for each i 1, 2 , if cmold is the output of an earlier mint or pour transaction txj, then the public address of the i-th output of txj is not contained in ADDR.

i

A ∈ { }

Intuitively, the above conditions ensure that, in L, A did not spend money that was not previously minted, or paid to an address under A’s control. Concretely, one can prove by induction that if (L, ˙a) is balanced then vUnspent + vBasecoin + vA→ADDR > vMint + vADDR→A.

In light of the above, it suffices to argue that the augmented ledger induced by the (modified) BAL experiment is balanced with all but negligible probability. Suppose, by way of contradiction, that is is not the case: A induces, with non-negligible probability, an augmented ledger (L, ˙a) that is not balanced. We distinguish between five cases, corresponding to which one of the above conditions does not hold with non-negligible probability. In each case, we show how to reach a contradiction, concluding the proof.

A violates Condition I. Suppose that Pr [ A wins but violates Condition I ] is non-negligible. By construction of ODAP, every (txPour, a) in (L, ˙a) for which txPour was not inserted by A satisfies Condition I; thus, the violation can only originate from a pair (txPour, a) in (L, ˙a) for which txPour was inserted by A and such that: (i) cmold = cmold; or (ii) there is i ∈ {1, 2} such that cmold has no

1

2

i

corresponding output coin commitment in any pour or mint transaction that precedes txPour on L. Observe that the validity of txPour implies that:

  • The two serial numbers snold and snold are distinct. Moreover, recalling that each snold equals

1

2

i

PRFsn (ρold), this also implies that (aold , ρold) ƒ= (aold , ρold).

a

old sk,i

i

sk,1

1

sk,2

2

  • The witness a contains two valid authentication paths path1, path2 for a Merkle tree constructed using only coin commitments of transactions preceding txPour in L.

In either (i) or (ii), we reach a contradiction. Indeed:

  1. If cmold = cmold, then the fact that snold ƒ= snold implies that the witness a contains two

1

2

1

2

distinct openings of cmold (the first opening contains (aold , ρold), while the second opening

1 sk,1 1

contains (aold , ρold)). This violates the binding property of the commitment scheme COMM.

sk,2 2

  1. If there is i ∈ {1, 2} such that cmold does not previously appear in L, then pathi is an invalid authentication path, and thus yields a collision in the function CRH. This violates the collision resistance of CRH.

i

A violates Condition II. Suppose that Pr [ A wins but violates Condition II ] is non-negligible. Observe that, when Condition II is violated, L contains two pour transactions txPour, txjPour spending the same coin commitment cm, and revealing two serial numbers sn and snj. Since txPour, txjPour are valid, it must be the case that sn = snj. However (as argued already above), if both transactions spend cm but produce different serial numbers, then the corresponding witnesses a, aj contain different openings of cm. This contradicts the binding property of the commitment scheme COMM.

ƒ

A violates Condition III. Suppose that Pr [ A wins but violates Condition III ] is non-negligible. In this case, the contradiction is immediate: whenever Condition III is violated, the equation

vold + vold = vnew + vnew + vpub does not hold, and thus, by construction of the statement POUR, the

1 2 1 2

soundness of the zk-SNARK is violated as well.

A violates Condition IV. Suppose that Pr [ A wins but violates Condition IV ] is non-negligible. Observe that, when Condition IV is violated, L contains:

  • a pour transaction txPour in which a coin commitment cmold is opened to a value vold; and also
  • a (mint or pour) transaction txj that opens cmold to a value vj different from vold. This contradicts the binding property of the commitment scheme COMM.

A violates Condition V. Suppose that Pr [ A wins but violates Condition V ] is non-negligible. Observe that, when Condition V is violated, L contains an inserted pour transaction txPour that spends the output of a previous transaction txj whose public address addrpk = (apk, pkenc) lies in ADDR; moreover, the witness associated to txj contains ask such that apk = PRFaddr(0). We omit the full argument, but one can verify that, in this case, we can construct a new adversary B that uses A to distinguish, with non-negligible probability, PRF from a random function.

ask

References

[BB04] Dan Boneh and Xavier Boyen. Secure identity based encryption without random oracles. In Proceedings of the 24th Annual International Cryptology Conference, CRYPTO ’04, pages 443–459, 2004.

[BBDP01] Mihir Bellare, Alexandra Boldyreva, Anand Desai, and David Pointcheval. Key-privacy in public- key encryption. In Proceedings of the 7th International Conference on the Theory and Application of Cryptology and Information Security, ASIACRYPT ’01, pages 566–582, 2001.

[BBSU12] Simon Barber, Xavier Boyen, Elaine Shi, and Ersin Uzun. Bitter to better – how to make Bitcoin a better currency. In Proceedings of the 16th International Conference on Financial Cryptography and Data Security, FC ’12, pages 399–414, 2012.

[BCCT12] Nir Bitansky, Ran Canetti, Alessandro Chiesa, and Eran Tromer. From extractable collision resistance to succinct non-interactive arguments of knowledge, and back again. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS ’12, pages 326–349, 2012.

[BCCT13] Nir Bitansky, Ran Canetti, Alessandro Chiesa, and Eran Tromer. Recursive composition and bootstrapping for SNARKs and proof-carrying data. In Proceedings of the 45th ACM Symposium on the Theory of Computing, STOC ’13, pages 111–120, 2013.

[BCG+13] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, and Madars Virza. SNARKs for C: verifying program executions succinctly and in zero knowledge. In Proceedings of the 33rd Annual International Cryptology Conference, CRYPTO ’13, pages 90–108, 2013.

[BCGT13a] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, and Eran Tromer. Fast reductions from RAMs to delegatable succinct constraint satisfaction problems. In Proceedings of the 4th Innovations in Theoretical Computer Science Conference, ITCS ’13, pages 401–414, 2013.

[BCGT13b] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, and Eran Tromer. On the concrete efficiency of probabilistically-checkable proofs. In Proceedings of the 45th ACM Symposium on the Theory of Computing, STOC ’13, pages 585–594, 2013.

[BCI+13] Nir Bitansky, Alessandro Chiesa, Yuval Ishai, Rafail Ostrovsky, and Omer Paneth. Succinct non- interactive arguments via linear interactive proofs. In Proceedings of the 10th Theory of Cryptography Conference, TCC ’13, pages 315–333, 2013.

[BCTV14] Eli Ben-Sasson, Alessandro Chiesa, Eran Tromer, and Madars Virza. Succinct non-interactive zero knowledge for a von Neumann architecture. In Proceedings of the 23rd USENIX Security Symposium, Security ’14, pages ???–???, 2014. Available at http://eprint.iacr.org/2013/879.

[Bel06] Mihir Bellare. New proofs for NMAC and HMAC: security without collision-resistance. In Proceedings of the 26th Annual International Conference on Advances in Cryptology, CRYPTO ’06, pages 602–619, 2006.

[Ben13] Eli Ben-Sasson. Universal and affordable computational integrity, May 2013. Bitcoin 2013: The Future of Payments. URL: http://www.youtube.com/watch?v=YRcPReUpkcU&feature=youtu.be&t=26m6s.

[BFLS91] La´szlo´ Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy. Checking computations in polyloga- rithmic time. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, STOC ’91, pages 21–32, 1991.

[BGH+05] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil Vadhan. Short PCPs verifiable in polylogarithmic time. In Proceedings of the 20th Annual IEEE Conference on Computational Complexity, CCC ’05, pages 120–134, 2005.

[Cer00] Certicom Research. SEC 1: Elliptic curve cryptography, 2000. URL: http://www.secg.org/collateral/ sec1_final.pdf.

[Cha82] David Chaum. Blind signatures for untraceable payments. In Proceedings of the 2nd Annual International Cryptology Conference, CRYPTO ’82, pages 199–203, 1982.

[CHL05] Jan Camenisch, Susan Hohenberger, and Anna Lysyanskaya. Compact e-cash. In Proceedings of the 24th Annual International Conference on Theory and Applications of Cryptographic Techniques, EUROCRYPT ’05, pages 302–321, 2005.

[CL01] Jan Camenisch and Anna Lysyanskaya. An efficient system for non-transferable anonymous credentials with optional anonymity revocation. In Proceedings of the 20th Annual International Conference on Theory and Application of Cryptographic Techniques, EUROCRYPT ’01, pages 93–118, 2001.

[DDM03] George Danezis, Roger Dingledine, and Nick Mathewson. Mixminion: Design of a type III anonymous remailer protocol. In Proceedings of the 2003 IEEE Symposium on Security and Privacy, SP ’03, pages 2–15, 2003.

[DFKP13] George Danezis, Cedric Fournet, Markulf Kohlweiss, and Bryan Parno. Pinocchio Coin: building Zerocoin from a succinct pairing-based proof system. In Proceedings of the 2013 Workshop on Language Support for Privacy Enhancing Technologies, PETShop ’13, 2013. URL: http://www0.cs.ucl.ac.uk/staff/G. Danezis/papers/DanezisFournetKohlweissParno13.pdf.

[DMS04] Roger Dingledine, Nick Mathewson, and Paul Syverson. Tor: the second-generation onion router. In

Proceedings of the 13th USENIX Security Symposium, Security ’04, pages 21–21, 2004.

[DW13] Christian Decker and Roger Wattenhofer. Information propagation in the Bitcoin network. In Proceedings of the 13th IEEE International Conference on Peer-to-Peer Computing, P2P ’13, pages 1–10, 2013.

[ES13]Ittay Eyal and Emin Gu¨n Sirer. Majority is not enough: Bitcoin mining is vulnerable, 2013.

[Gen04] Rosario Gennaro. Multi-trapdoor commitments and their applications to proofs of knowledge secure under concurrent man-in-the-middle attacks. In Proceedings of the 24th Annual International Cryptology Conference, CRYPTO ’04, pages 220–236, 2004.

[GGPR13] Rosario Gennaro, Craig Gentry, Bryan Parno, and Mariana Raykova. Quadratic span programs and succinct NIZKs without PCPs. In Proceedings of the 32nd Annual International Conference on Theory and Application of Cryptographic Techniques, EUROCRYPT ’13, pages 626–645, 2013.

[GMR89] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM Journal on Computing, 18(1):186–208, 1989. Preliminary version appeared in STOC ’85.

[GOS06a] Jens Groth, Rafail Ostrovsky, and Amit Sahai. Non-interactive Zaps and new techniques for NIZK. In Proceedings of the 26th Annual International Conference on Advances in Cryptology, CRYPTO ’06, pages 97–111, 2006.

[GOS06b] Jens Groth, Rafail Ostrovsky, and Amit Sahai. Perfect non-interactive zero knowledge for NP. In Proceedings of the 25th Annual International Conference on Advances in Cryptology, EUROCRYPT ’06, pages 339–358, 2006.

[Gro10] Jens Groth. Short pairing-based non-interactive zero-knowledge arguments. In Proceedings of the 16th International Conference on the Theory and Application of Cryptology and Information Security, ASIACRYPT ’10, pages 321–340, 2010.

[GW11] Craig Gentry and Daniel Wichs. Separating succinct non-interactive arguments from all falsifiable assumptions. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, STOC ’11, pages 99–108, 2011.

[KL07] Jonathan Katz and Yehuda Lindell. Introduction to Modern Cryptography. Chapman & Hall/CRC, 2007. [Lee13] Timothy B. Lee. Bitcoin needs to scale by a factor of 1000 to compete with Visa. here’s how to do it.

The Washington Post (http://www.washingtonpost.com), November 2013.

[Lip12] Helger Lipmaa. Progression-free sets and sublinear pairing-based non-interactive zero-knowledge argu- ments. In Proceedings of the 9th Theory of Cryptography Conference on Theory of Cryptography, TCC ’12, pages 169–189, 2012.

[Lip13] Helger Lipmaa. Succinct non-interactive zero knowledge arguments from span programs and linear error-correcting codes. In Proceedings of the 19th International Conference on the Theory and Application of Cryptology and Information Security, ASIACRYPT ’13, pages 41–60, 2013.

[Max13] Greg Maxwell. CoinJoin: Bitcoin privacy for the real world, August 2013. Bitcoin Forum. URL:

https://bitcointalk.org/index.php?topic=279249.0.

[MGGR13] Ian Miers, Christina Garman, Matthew Green, and Aviel D. Rubin. Zerocoin: Anonymous distributed e-cash from bitcoin. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, SP ’13, pages 397–411, 2013.

[Mic00] Silvio Micali. Computationally sound proofs. SIAM Journal on Computing, 30(4):1253–1298, 2000.

Preliminary version appeared in FOCS ’94.

[MPJ+13] Sarah Meiklejohn, Marjori Pomarole, Grant Jordan, Kirill Levchenko, Damon McCoy, Geoffrey M. Voelker, and Stefan Savage. A fistful of Bitcoins: Characterizing payments among men with no names. In Proceedings of the 2013 Conference on Internet Measurement Conference, IMC ’13, pages 127–140, 2013.

[Nak09] Satoshi Nakamoto. Bitcoin: a peer-to-peer electronic cash system, 2009. URL: http://www.bitcoin. org/bitcoin.pdf.

[Nat12] National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. http:

//csrc.nist.gov/publications/PubsFIPS.html, 2012.

[PGHR13] Bryan Parno, Craig Gentry, Jon Howell, and Mariana Raykova. Pinocchio: nearly practical verifiable computation. In Proceedings of the 34th IEEE Symposium on Security and Privacy, Oakland ’13, pages 238–252, 2013.

[Pol13]PolarSSL. PolarSSL. http://polarssl.org, Oct 2013.

[RM11] Fergal Reid and Harrigan Martin. An analysis of anonymity in the Bitcoin system. In Proceedings of the 3rd IEEE International Conference on Privacy, Security, Risk and Trust and on Social Computing, SocialCom/PASSAT ’11, pages 1318–1326, 2011.

[RS12] Dorit Ron and Adi Shamir. Quantitative analysis of the full Bitcoin transaction graph. Cryptology ePrint Archive, Report 2012/584, 2012.

[ST99] Tomas Sander and Amnon Ta-Shma. Auditable, anonymous electronic cash. In Proceedings of the 19th Annual International Cryptology Conference on Advances in Cryptology, CRYPTO ’99, pages 555–572, 1999.

[Val08] Paul Valiant. Incrementally verifiable computation or proofs of knowledge imply time/space efficiency.

In Proceedings of the 5th Theory of Cryptography Conference, TCC ’08, pages 1–18, 2008.

[Wui14] Pieter Wuille. Proposed BIP for dealing with malleability. Available at https://gist.github.com/ sipa/8907691, 2014.