The problem
How can you trust a data store distributed among an arbitrary number of independent peers? Since each peer has control over its own version of the data, how can you reach a consensus? In a nutshell, these are the main problems blockchain solves.
The core idea revolves around immutability: once written, the data stored in a blockchain can't be modified without consensus between peers and becomes literally impossible to alter as the transaction gets older (because of the computational cost associated with the proof of work, more about it later on).
In layman's terms
Theoretically, you could build a blockchain-like system using only spreadsheets. Each block would be a row and the entire spreadsheet your blockchain. Each row would have a timestamp, some fields where you could store data (e.g. transaction data, account owner, the amount transferred, recipient, etc), a reference to the previous row's hash and the cryptographic hash value of all the fields combined (including the reference field: this puts the chain in the block*chain*).
I'm assuming you're familiar with cryptographic hashing, if not, there are plenty of resources out there explaining it in various levels of detail (starting with Wikipedia). Simply put a cryptographic hash function is a one-way function that always produces the same hash value for the same input in a way that makes it impossible (or extremely difficult) to reverse the process (i.e. to find the original input, given the hash value)
Since all fields get hashed together, any attempt to modify one would automatically result in a different hash value for that specific row.
So, in order to modify our table and keep the blockchain valid we would have to:
- change the row
- recalculate its hash value
- recalculate all the hash values of the subsequent rows since they won't have the correct previous row hash value anymore
This is time-consuming for a human, but nothing for a computer. The hashes can be recalculated in less than a second even for a large blockchain.
This poses a problem. When picking between two conflicting versions of the chain, the one that required the most computational effort (combined among multiple peers) wins. If chains can be easily altered and recreated the CPU time race turns into a bandwidth race slowing down the entire system.
That's why blockchain requires proofs of work: computing a hash value might be fast, however, computing a hash value with a certain constraint (e.g. n
leading zeros: normally an SHA hash looks like this 979c8b5f0598fec69...
, our constraint would require a new field, the nonce, be tweaked until the hash starts with a certain number of zeros, 0000001fb32013...
) requires significantly more time.
The best way to learn is by building
The first thing we need to consider is the block. Like in the spreadsheet example each block has a timestamp, data, and a reference to the previous block. In addition, we have a method to serialize our block (toString
- in a more realistic setup we would also have one to deserialize it), one that calculates the hash based on the current fields, and some setters. Nothing spectacular so far.
class Block {
/**
* Represents a block
* @constructor
* @param {number} timestamp - The timestamp of the block creation
* @param {data} data - Arbitrary data we wish to store in the block
* @param {object} prev - A reference to the previous block, if any
*/
constructor(timestamp, data, prev) {
assert(timestamp)
assert(data)
this.timestamp = timestamp
this.data = data
this.nonce = 0
this.prev = prev
}
/**
* Used to update the nounce in order to get a different hash value.
*/
randomizeNonce() {
this.nonce++
}
/**
* Calculates the hash of the block based on its string representation
*/
calculateHash() {
return crypto
.createHash('sha256')
.update(this.toString())
.digest('hex')
}
/**
* Once we know the required hash, we seal the block by setting its hash
* value.
*/
seal(hash) {
this.hash = hash
}
/**
* Returns the string representation of the block
*/
toString() {
return (this.prev && this.prev.hash || '') + '\n' +
JSON.stringify(this.data) + '\n' +
this.timestamp + '\n' +
this.nonce
}
}
Next, we need to create a class responsible for calculating the block's zero prefixed hash. Doing this requires multiple tries and therefore time.
class Miner {
/**
* Represents the entity that does the work calculating the required hash for each block.
* @constructor
* @param {number} complexity - In our case, the complexity is the requireed number of zeros in front of our hash value. As we increase this value the time and resources needed to compute the hash increases *exponentially*.
*/
constructor(complexity = 1) {
this.complexity = complexity
this.prefix = new Array(this.complexity + 1).join('0')
}
/**
* Mining basically means finding a nonce which hashed together with the rest of the fields, the output hash will be prefixed with a number of zeros equal to the complexity.
*/
mine(block) {
let hash = ''
while(!this.isHashValid(hash)) {
block.randomizeNonce()
hash = block.calculateHash()
}
block.seal(hash)
}
isHashValid(hash) {
return hash.startsWith(this.prefix)
}
}
And finally, the piece that puts everything together. Adding a block to the chain is straightforward enough: we create it using the current timestamp and a reference to our last block, mine its hash then simply replace the last block with our newly created one.
Things get more interesting in the isValid
method. To validate the entire blockchain we need to:
- go through each block
- check the proof of work by ensuring the hash starts with the given number of zeros
- check the hash (there's a small catch here, when we calculate current hash we also include the previous block's hash and thus validating the link between them)
- finally, when we reach the last block we need to make sure it's the same genesis block as the one we created, otherwise the blockchain could be valid, but it might not be our blockchain.
class Blockchain {
/**
* The aggregator that puts everything together
*/
constructor() {
this.genesisBlock = new Block(Date.now(), {})
this.lastBlock = this.genesisBlock
this.miner = new Miner(2)
}
/**
* Adds a new block with the given data
* @constructor
* @param {object} data - Arbitrary data
*/
addBlock(data) {
const block = new Block(Date.now(), data, this.lastBlock)
this.miner.mine(block)
this.lastBlock = block
return block
}
/**
* Checks if the entire blockchain is valid
*/
isValid() {
// we start with the last block and while there
// is a previous block before it
// (in other words, it's not the genesis block),
// we check its hash with the value we compute using
// the current data (possibly tampered) and the previous'
// block hash
let block = this.lastBlock
while (block.prev) {
if (!this.miner.isHashValid(block.hash) ||
block.hash != block.calculateHash()) {
return false
}
block = block.prev
}
// if we reach the last block, we compare it with the
// genesis block to be sure not only that the blockchain is
// valid but also that it's **our** blockchain
if (block.toString() != this.genesisBlock.toString()) {
return false
}
return true
}
}
And there you have it. Your own toy blockchain. Feel free to give it a spin:
const blockchain = new Blockchain()
const quiBlock = blockchain.addBlock({
name: 'Qui-Gon',
order: 'jedi'
})
const obiBlock = blockchain.addBlock({
name: 'Obi-Wan',
order: 'jedi'
})
// the block should be valid
blockchain.isValid()
// tampering with the data
firstBlock.data.order = 'sith'
// the blockchain is not valid anymore
blockchain.isValid()
FAQ
Where does the data (i.e. the actual bytes) lives?
Blockchain data (e.g. the string representation of each block, one after the other) lives on each of the peers which are actively taking part in the blockchain network. There is no central storage authority/entity.What happens if one peer tries to fraudulently modify the blockchain?
To keep the blockchain valid, it would have to modify the hashes of all the blocks following the one that he alters. That would take a lot of time, depending on the size of the blockchain. Even if it manages to do it somehow, since there are usually many other peers in the network with the blockchain unaltered, the majority would win when reaching consensus. Modifying the blockchain requires control over 50% + 1 of the network and a lot of computing power.I don't have a distributed network at my disposal, how can I use blockchain technology?
There are lots of options out there like the Ethereum project that allow you to use their network to store arbitrary data.
Top comments (0)