Hashing in Blockchain explained

Intro

Blockchain technology is, without a doubt, one of the most defining technological innovations of our times. It has redefined how digital transactions are verified and stored through the use of Distributed Ledger Technologies (DLT). But to understand how blockchain works in cryptocurrency, we need to keep one basic concept in mind: hashing. If you are new to blockchain and wish to understand how Bitcoin works, for instance, then a grasp of this concept and the related terminology will come in handy. In this article, I try to simplify it, though the subject is a bit technical.

What's hashing in blockchain?

Hashing in blockchain refers to the process of turning an input of any length into an output of a fixed length. Take the example of a photo editor: you can make changes to an image, but once the image is saved and a hash value is generated, any subsequent change to the image will result in a different hash value. Similarly, in a blockchain system, any change made to a block, such as an altered transaction, will result in a different hash value.

In cryptocurrencies, transactions of varying lengths are run through a given hashing algorithm, and all of them give an output of a fixed length, regardless of the length of the input. The output is what we call a hash. A good example is the Secure Hash Algorithm 256 (commonly shortened to SHA-256) used by Bitcoin. SHA-256 always gives an output of 256 bits (32 bytes). This is the case whether the input is a single word or a complex transaction with huge amounts of data. What this means is that keeping track of a transaction becomes easier, because you can look it up or trace it by its hash. The size of the hash depends on the hash function used, but the output of a particular hashing algorithm is always of one specific size.
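Here is a minimal Python sketch of the fixed-length behavior, using the SHA-256 implementation in Python's standard hashlib module. The helper name sha256_hex and the two sample inputs are my own illustrative choices, not anything taken from Bitcoin itself.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"hi"
long_input = b"x" * 5_000  # stand-in for a 5-kilobyte text message

for label, data in (("short", short_input), ("long", long_input)):
    digest = sha256_hex(data)
    # Each hex character encodes 4 bits, so 64 characters = 256 bits.
    print(label, len(data), "bytes ->", len(digest) * 4, "bits:", digest)
```

Both lines report 256 bits, even though one input is 2 bytes long and the other is 5,000.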
Example:
When you take a YouTube video of, say, 50 megabytes and hash it using SHA-256, the output will be a hash 256 bits in length. Similarly, if you take a text message of 5 kilobytes, the output hash will still be 256 bits. The only difference between the two will be the hash value itself. Let’s summarize the above under the umbrella term for all of this: cryptographic hash functions.

Cryptographic hash functions

A hash function takes any transaction or data input and processes it to produce an output of a fixed size. The process of using a given hash function to process a transaction is called hashing. The output of that hash function is what we call a hash. Those are the basics, but there is more we need to expound on to demystify hashing in blockchain. At this point, I want to emphasize that a basic characteristic of any given hash function is the size of its output. This is one of the things that distinguishes the different hash functions (we will get to that in a moment).

Characteristics of cryptographic hash functions

For a cryptographic hash function to be considered secure, it has to exhibit certain characteristics or properties. It is these properties that make the hash function suitable for cryptocurrencies like Bitcoin or Ethereum that utilize blockchain technology. Let me explain each one in simple terms.

Deterministic

A hash function needs to give the same output for the same input. It doesn’t matter how many times you process a given input using a hash function; the result is always the same hash, not just a hash of the same length. Hashes of different inputs will look random and unrelated to each other, but they all have the same size. Why is this important? Imagine getting a different result every time you hashed the same transaction. It would be impossible for you to keep track of any input data using its hash.
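A quick way to see this is to hash the same input twice and compare the results. The sketch below uses Python's hashlib; the transaction text is a made-up placeholder, not a real Bitcoin transaction format.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


tx = b"Alice pays Bob 1 BTC"  # hypothetical transaction text

first = sha256_hex(tx)
second = sha256_hex(tx)

# Same input, same function: the two digests are identical.
print(first == second)  # True
```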

Quick Computation

In blockchain technology, a good hash function is one that computes quickly for every data input. It may be difficult to find the input data for a hash, but computing the hash should ideally be very fast. For instance, you can have the hash of a simple “hi” within a fraction of a second. The time needed grows with the size of the input, but even a very large file is hashed quickly on ordinary hardware.
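If you want a rough feel for the speed on your own machine, you can time the hash of a tiny input and of a larger one. This is only a sketch: the helper name time_sha256 and the 10-megabyte buffer of zero bytes are my own assumptions, and the timings will vary with hardware.

```python
import hashlib
import time
from typing import Tuple


def time_sha256(data: bytes) -> Tuple[str, float]:
    """Hash `data` with SHA-256 and return (hex digest, elapsed seconds)."""
    start = time.perf_counter()
    digest = hashlib.sha256(data).hexdigest()
    elapsed = time.perf_counter() - start
    return digest, elapsed


small = b"hi"
large = bytes(10 * 1024 * 1024)  # 10 MB of zero bytes as a stand-in for a big file

for label, data in (("small", small), ("large", large)):
    digest, seconds = time_sha256(data)
    print(f"{label}: {len(data)} bytes hashed in {seconds:.6f} s")
```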

Pre-image resistance

One of the important properties of secure cryptographic hash functions is that they are one-way. Let’s take it this way: given the hash of a particular transaction, it should be practically infeasible to determine the original input data from this output. This property lends a level of security to the blockchain. When given a particular hash, the only way of finding the original input data is to hash all the possible inputs until you eventually hit the one that produces the same hash. However, because the number of possible inputs is astronomically large, trying them all is practically impossible.
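To make "hashing all the possible combinations" concrete, here is a sketch that brute-forces a deliberately tiny input space: a four-digit PIN. The PIN value and the function names are illustrative assumptions. The search only succeeds because there are just 10,000 candidates; real transaction data has vastly more possibilities, which is what makes the same approach hopeless in practice.

```python
import hashlib
from itertools import product
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def brute_force_pin(target_hash: str, digits: int = 4) -> Optional[str]:
    """Try every PIN of the given length until one hashes to target_hash."""
    for combo in product("0123456789", repeat=digits):
        candidate = "".join(combo)
        if sha256_hex(candidate.encode()) == target_hash:
            return candidate
    return None  # no PIN of this length matches


secret_pin = "4821"  # hypothetical secret, known only to whoever made the hash
target = sha256_hex(secret_pin.encode())

print(brute_force_pin(target))  # 4821
```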

Different hashes for every input (Randomized)

A good hash function produces a completely different-looking output even if the input data differs by only a digit or letter, a behavior often called the avalanche effect. For instance, the hash of the word “Alpha” should be completely different from the hash of the word “Alpha1”. If the patterns were similar and differed only at the end, then deciphering them would be easy.
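You can check the “Alpha” versus “Alpha1” example yourself. The sketch below prints both SHA-256 hashes and counts how many of the 256 output bits differ; the helper name differing_bits is my own. For a well-behaved hash function you would expect roughly half of the bits to differ.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    ha = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 in every position where the two digests disagree.
    return bin(ha ^ hb).count("1")


print(hashlib.sha256(b"Alpha").hexdigest())
print(hashlib.sha256(b"Alpha1").hexdigest())
print(differing_bits(b"Alpha", b"Alpha1"), "of 256 bits differ")
```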

Collision resistant

Cryptographic hash functions are also supposed to be collision resistant. A collision occurs when a hash function gives the same output for two different inputs. For example, if “pic1” is a photo and “pic2” is a video, but a hash function produces the same output for both, then we call that a collision. Because inputs can be of any length while outputs have a fixed length, collisions must exist in theory; a secure hash function makes them practically infeasible to find. The “Birthday Paradox” explains why a collision can turn up after far fewer attempts than intuition suggests.
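Finding a collision in full SHA-256 is not something you can demonstrate on a laptop, so the sketch below cheats on purpose: it keeps only the first 2 bytes (16 bits) of each SHA-256 digest, which gives a toy hash with just 65,536 possible outputs. The function names and the choice of counting numbers as inputs are my own assumptions. With so few outputs, the birthday effect means a collision usually shows up after only a few hundred tries.

```python
import hashlib
from typing import Dict, Tuple


def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of SHA-256: a deliberately weakened toy hash."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2) -> Tuple[bytes, bytes, int]:
    """Hash "0", "1", "2", ... until two inputs share a truncated hash."""
    seen: Dict[bytes, bytes] = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            # Two different inputs, same toy hash: a collision.
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1


first, second, attempts = find_collision()
print(first, "and", second, "collide after", attempts, "attempts")
```

Every extra byte kept makes the search much longer, which is why the full 256-bit output is considered safe against this kind of search.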

Examples of cryptographic hash functions

SHA-256: produces a 256-bit hash; currently in use on the Bitcoin network
Keccak-256: produces a 256-bit hash; currently in use on the Ethereum network

Blockchains and Hashing - where is it used?

Hashing is applied in blockchain as seen in some of the examples used above. Here are more examples.

Addresses on the blockchain are derived from hashing, e.g. Bitcoin addresses use SHA-256 and RIPEMD-160.
Hashing is part of creating the cryptographic signatures that help determine valid transactions.
The hash of a transaction makes it easy to keep track of transactions on the blockchain. Instead of looking for a transaction that was the “1030th in block 14573”, it is easier just to copy the hash into a blockchain explorer from where you can view the transaction details.
Hash functions are crucial in crypto mining, where a valid nonce is discovered by computing many hashes. This helps to form a consensus on the blockchain.
Storing the “hash of the data” instead of the data itself helps to reference large amounts of data on the blockchain. The hash is time-stamped and can be checked against the data in the future. It makes permanent data storage less bulky, or simply more economical.
Hashrate is a measure of how fast the mining process is running, that is, how many hashes are computed per second. It is vital in determining difficulty levels during mining.
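The mining item above can be sketched as a toy proof-of-work loop: keep trying nonces until the hash of the block data plus the nonce starts with a given number of zeros. This is a simplified illustration under my own assumptions (the function name mine, the text block data, and a difficulty counted in leading hex zeros). Bitcoin itself hashes an 80-byte block header twice with SHA-256 and compares the result against a numeric target.

```python
import hashlib
from typing import Optional, Tuple


def mine(block_data: bytes, difficulty: int = 4,
         max_nonce: int = 10_000_000) -> Optional[Tuple[int, str]]:
    """Search for a nonce whose hash starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest  # valid nonce found
    return None  # gave up after max_nonce attempts


result = mine(b"block 1: Alice pays Bob")  # hypothetical block contents
if result is not None:
    nonce, digest = result
    print("nonce:", nonce)
    print("hash: ", digest)
```

Each extra zero in the prefix makes a valid nonce about 16 times harder to find, which is the idea behind adjusting the difficulty level as the network's hashrate changes.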

Conclusion

The cryptographic hash function is an integral part of the blockchain innovation. It is essentially the feature that gives security capabilities to the processed transactions, making them immutable. Hashing is also at the center of “Merkle Trees”, which are a more advanced approach to blockchain hashing. They are useful for scalability and for mobile/light wallets.