dhruvbansal / plutus

Code for hosting the bitcoin blockchain on top of TitanDB

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Plutus

Data Model

The raw data for the blockchain is generated using two tools:

  • bitcoind which downloads the blockchain and keeps it synced
  • blockparser which dumps the blockchain data to CSV format

As of approximately 2014-10-12 I have

$ wc -l /data1/blockchain/*
     317530 /data1/blockchain/blocks.csv
  108235459 /data1/blockchain/inputs.csv
  121222652 /data1/blockchain/outputs.csv
   45348658 /data1/blockchain/transactions.csv
  275,124,299 total

Sample data from the output of blockparser (indeed, the first few 1000 blocks & transactions) is included in the data directory. The following command:

$ head -n2 data/*.csv
==> data/blocks.csv <==
ID,Hash,Version,Timestamp,Nonce,Difficulty,Merkle,NumTransactions,OutputValue,FeesValue,Size
0,"000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f",1,"2009-01-03T18:15:05Z",2083236893,1.000000,"4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",1,5000000000,0,285

==> data/inputs.csv <==
TransactionId,Index,Script,OutputTxHash,OutputTxIndex
171,0,"01091d8d76a82122082246acbb6cc51c839d9012ddaca46048de07ca8eec221518200241cdb85fab4815c6c624d6e932774f3fdf5fa2a1d3a1614951afb83269e1454e2002443047","0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9",0

==> data/outputs.csv <==
TransactionId,Index,Value,Script,ReceivingAddress,InputTxHash,InputTxIndex
0,0,5000000000,"ac5f1df16b2b704c8a578d0bbaf74d385cde12c11ee50455f3c438ef4c3fbcf649b6de611feae06279a60939e028a8d65c10b73071a6f16719274855feb0fd8a670441","1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa",,

==> data/transactions.csv <==
ID,Hash,Version,BlockId,NumInputs,NumOutputs,OutputValue,FeesValue,LockTime,Size
0,"4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",1,0,0,1,5000000000,0,3652501241,204

reveals the data model used by the blockchain.

Here is a simple diagram of these relationships:

                                 /------- spent -------------\
                                 |                           |
                                 v                           |
  _______                   _____________                 _________
  |BLOCK| <--- included --- |TRANSACTION| --- output ---> |ADDRESS|
  -------                   -------------                 ---------
   |   ^	                  |   ^	          	    
   |   |	  	          |   |	         	  
/--/   \-------\	       /--/   \-------\        
|              |	       |              |                
\--- parent ---/	       \--- input ----/

the data should be loaded in the following order:

  1. blocks
  2. transactions
  3. outputs
  4. inputs

Schema

See the TitanDB schema docs for details:

Vertices

Vertex Label
block
transaction
address

Edges

Edge LabelSource Vertex LabelTarget Vertex LabelMultiplicityReasoning
parentblockblockMANY2ONEEach block only has one parent but a given block can have multiple blocks claiming to descend from it (a blockchain fork)
includedtransactionblockMANY2ONEEvery transaction is included in one block (at least once it’s confirmed?)…
inputtransactiontransactionMULTIA transaction can be an input in only one other transaction…except when it can appear more than once!
outputtransactionaddressMULTIEach transaction can have multiple output addresses and multiple transactions can have the same address as output
spentaddresstransactionMULTIEach address can only be spent in a single transaction…except when it can appear more than once!

Properties

Property KeyProperty Key Data TypeProperty Key CardinalityVertex LabelsEdge LabelsFlagsKey
bkidlongSINGLEblockuniquecomposite
txidlongSINGLEtransactionuniquecomposite
hashstringSINGLEaddressuniquecomposite
bkHashstringSINGLEblockuniquecomposite
txHashstringSINGLEtransactionuniquecomposite
versionstringSINGLEblock, transactioncomposite
outputValuelongSINGLEblock, transactionmixed
feesValuelongSINGLEblock, transactionmixed
sizelongSINGLEblock, transactionmixed
timestamplongSINGLEblockmixed
noncestringSINGLEblockcomposite
difficultyfloatSINGLEblockmixed
merklestringSINGLEblockcomposite
numTxintegerSINGLEblockprecalcmixed
numInputsintegerSINGLEtransactionprecalcmixed
numOutputsintegerSINGLEtransactionprecalcmixed
lockTimelongSINGLEtransactionmixed
balancelongSINGLEaddressprecalcmixed
indexintegerSINGLEinput, outputvertex
scriptstringSINGLEinput, outputcomposite, vertex
outputTxIndexintegerSINGLEinput, spentvertex
valuelongSINGLEoutputmixed, vertex
rcvrAddressHashstringSINGLEoutputcomposite, vertex

About

Code for hosting the bitcoin blockchain on top of TitanDB


Languages

Language:Groovy 60.9%Language:Java 21.8%Language:Ruby 17.3%