Hash and hashing - what is it in simple words?

Hello, dear readers of the Tyulyagin project! In today's article about cryptocurrencies, we will talk about hashes and hashing. In the article you will learn what a hash and a hash function are, how hashes are structured in general, and how hashing works in cryptocurrencies. It also gives examples of hashes and answers the most popular questions about hashes and hashing. More on all this later in the article.

What is a hash?

A hash function is a mathematical function that converts an input of arbitrary length into a fixed-length output, and that output is called a hash. Hashing is not encryption: there is no key and nothing to decrypt. No matter how large the original data or file is, its hash will always be the same size. Moreover, hashes cannot be used to "reverse engineer" the input from the output, since hash functions are "one-way" (like a meat grinder: you can't turn ground beef back into a steak). However, if you apply the function to the same data again, the hash will be identical, so if you already know the hash of some data you can check that the data has not changed.

Hashing is also important for blockchain management in cryptocurrency.

What is hash used for?

Great question. However, the answer is not so simple, since cryptographic hashes are used for a huge number of things.

For you and me, ordinary users, the most common application of hashing is storing passwords. For example, if you have forgotten the password to any online service, you will most likely have to use the password recovery function. In this case, however, you will not receive your old password, since the online service does not actually store user passwords in plain text. Instead, it stores them as hash values. That is, even the service itself cannot know what your password actually looks like. The only exception is when the password is very simple and its hash value is widely known in cracking circles. Thus, if, after using the recovery function, you suddenly received the old password in clear text, you can be sure: the service you are using does not hash user passwords, which is very bad.
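To make the idea concrete, here is a minimal sketch of how a service might store and check a password without keeping it in plain text. It is an illustration under assumptions, not a description of any particular service: it uses PBKDF2 with a random salt (a common hardening of plain hashing) from Python's standard library, and the function names and the iteration count are hypothetical choices made for this example.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; real systems tune it to their hardware.
ITERATIONS = 100_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); these are stored instead of the password."""
    salt = os.urandom(16)  # a random per-user salt defeats precomputed hash lists
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, stored_key)
```

At login the service repeats the same computation on whatever the user typed and compares the results, so the original password never needs to be stored or recovered.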

You can even do a simple experiment: take the MD5 hash of a simple password like “123456” or “password” and use one of the websites built for this purpose to try converting it back into text. The likelihood that the site's hash database already contains such simple passwords is very high. In my case, the hashes of the words “brain” (8b373710bcf876edd91f281e50ed58ab) and “Brian” (4d236810821e8e83a025f2a83ea31820) were successfully recognized, but the hash of the previous paragraph was not. An excellent example, just for those who still use simple passwords.
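Such websites do not actually reverse the hash function. A plausible way for them to work is a precomputed lookup table, sketched below. The word list, the table and the function names are hypothetical and exist only for this illustration; real databases hold vastly more entries.

```python
import hashlib
from typing import Optional


def md5_hex(text: str) -> str:
    """MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical list of common passwords for this sketch.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]

# Hash every known password once, then index the passwords by their hash.
LOOKUP = {md5_hex(p): p for p in COMMON_PASSWORDS}


def reverse_lookup(digest: str) -> Optional[str]:
    """Return the password if its hash is already in the table, else None."""
    return LOOKUP.get(digest.lower())
```

A long, unusual passphrase is simply absent from the table, which is why its hash is not "recognized".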

Here is another, more striking example. Not long ago, news spread across tech sites that the popular cloud service Dropbox had blocked one of its users for distributing copyrighted content. The hero of the story immediately wrote about this on Twitter, triggering a wave of indignation among users, who rushed to accuse Dropbox of viewing the contents of client accounts even though it has no right to do so.

However, Dropbox had no need to look inside anyone's account. The fact is that the owner of the copyright-protected content had the hashes of certain audio and video files prohibited for distribution and added them to a list of blocked hashes. When a user attempted to illegally distribute such content, Dropbox's automated scanners detected files whose hashes were on that list and blocked their distribution.
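The check described here can be sketched in a few lines. This is only an illustration of the idea: the article does not say which hash function or data structure Dropbox uses, so SHA-256, the blocklist contents and the function names below are assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of raw bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical blocklist: hashes supplied by a rights holder.
BLOCKED_HASHES = {sha256_hex(b"contents of a prohibited file")}


def is_blocked(file_bytes: bytes) -> bool:
    """Compare only the file's hash against the list, never its content."""
    return sha256_hex(file_bytes) in BLOCKED_HASHES
```

The scanner learns nothing about a file beyond whether its hash is on the list, which is why no one has to open the user's files.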

Where else can hash functions be used besides password storage systems and media file protection? In fact, there are many more tasks where hashing is used than I know, much less can describe in one article. However, there is one special area of application of hashes that is especially close to us as employees of Kaspersky Lab: hashing is widely used to detect malicious programs by security software, including those produced by our company.

How hash functions work

Typical hash functions take variable-length input to return a fixed-length output. A cryptographic hash function combines the message passing capabilities of hash functions with security properties.

Hash functions are widely used in computing systems for tasks such as checking message integrity and authenticating information. Ordinary hash functions are considered cryptographically "weak" because they can be solved in polynomial time, yet even they are not easy to reverse.

Cryptographic hash functions add security features to typical hash functions, making message content or information about recipients and senders difficult to discover.

Specifically, cryptographic hash functions have these three properties:

  • They are “collision-free”. This means that it should be infeasible to find two different inputs that map to the same output hash.
  • They are hiding. It must be difficult to guess the input value of a hash function from its output.
  • They are puzzle-friendly. It must be difficult to find an input that produces a predetermined output. For this reason, inputs should be drawn from as wide a distribution as possible.

Because of these properties, hashes are widely used in online security, from protecting passwords to detecting data leaks and verifying the integrity of a downloaded file.
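Verifying a download is probably the easiest of these uses to try yourself. The sketch below hashes a stream in chunks and compares the result with a published value. The function names are made up for this example, and `io.BytesIO` merely stands in for a real downloaded file.

```python
import hashlib
import io
from typing import BinaryIO


def stream_sha256(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a binary stream chunk by chunk so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(stream: BinaryIO, expected_hex: str) -> bool:
    """True if the stream's SHA-256 equals the hash published by the author."""
    return stream_sha256(stream) == expected_hex.strip().lower()


# Demo with in-memory data standing in for a downloaded file.
payload = b"example installer bytes"
published = hashlib.sha256(payload).hexdigest()
print(matches_published_hash(io.BytesIO(payload), published))
print(matches_published_hash(io.BytesIO(payload + b"!"), published))
```

With a real file you would pass `open(path, "rb")` instead of the in-memory stream.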

Hashing and cryptocurrencies

The basis of cryptocurrency is the blockchain, which is a global distributed ledger formed by linking individual blocks of transaction data. The blockchain contains only confirmed transactions, which prevents fraudulent transactions and double spending of currency. The data in each block is passed through a hash function, and the resulting value, a series of numbers and letters that bears no resemblance to the original data, is called a hash. Cryptocurrency mining involves working with this hash.

Hashing requires processing the data from a block using a mathematical function, resulting in a fixed-length output. Using a fixed-length output improves security because anyone trying to reverse the hash will not be able to tell how long or short the input is just by looking at the length of the output.

Solving a hash starts with the data in the block header and essentially amounts to solving a complex mathematical problem. Each block header contains the version number, timestamp, hash used in the previous block, Merkle root hash, nonce, and target hash.

The miner focuses on the nonce, a string of numbers. This number is appended to the hashed contents of the previous block, and the result is then hashed. If this new hash is less than or equal to the target hash, it is accepted as the solution, the miner receives a reward, and the block is added to the blockchain.
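The search for a nonce can be illustrated with a toy version of proof of work. This is a simplified sketch, not a real miner: the header is just a byte string, the nonce is serialized in a way chosen for this example, and the target is deliberately easy so the loop finishes quickly. Only the double SHA-256 step mirrors what Bitcoin does. All names are hypothetical.

```python
import hashlib
from typing import Optional


def block_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as one big integer."""
    data = header + nonce.to_bytes(8, "big")  # simplified serialization
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, target: int, max_nonce: int = 2**32) -> Optional[int]:
    """Try nonces in order until the hash is less than or equal to the target."""
    for nonce in range(max_nonce):
        if block_hash(header, nonce) <= target:
            return nonce
    return None  # no solution in the searched range


# Toy target: the top 12 bits of the hash must be zero.
TOY_TARGET = 2 ** (256 - 12) - 1
```

Calling `mine(b"toy header", TOY_TARGET)` returns the first nonce that works. Lowering the target makes the search longer, and that is how difficulty is adjusted.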

The blockchain transaction verification process relies on hashing the data algorithmically.

Introduction to the SHA Family

The SHA-1 algorithm was developed by the US National Security Agency (NSA) and published as a federal standard by the US National Institute of Standards and Technology (NIST) in 1995. NIST-issued cryptographic standards are trusted throughout the world and are generally required on all computers used by the United States government or military. SHA-1 replaced previous weakened hash functions such as MD5.

Over time, a series of cryptographic attacks on SHA-1 reduced its effective security strength. Because of this, in 2002, the NSA and NIST chose SHA-2 as the new recommended hashing standard. This happened long before SHA-1 was considered broken. In February 2017, a successful hash collision attack was demonstrated, which rendered SHA-1 useless for protecting electronic signatures.

An excellent discussion of SHA-1 hacking and example documentation can be found here.

Hash Features

Solving the hash requires the miner to determine which value to use as the nonce, which itself requires a significant amount of trial and error. This is because the nonce is effectively a random value: there is no shortcut for predicting which one will work. It is unlikely that a miner will find the correct nonce on the first try, meaning that the miner may have to test a large number of nonces before getting it right. The higher the difficulty (a measure of how hard it is to produce a hash that satisfies the requirements of the target hash), the longer it will likely take to generate a solution.
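The link between the target and the amount of trial and error can be written down roughly. As a back-of-the-envelope sketch, assume the hash function H behaves like a uniformly random 256-bit number and that each attempt is independent. Here T is the target and x is a candidate header with its nonce (both symbols are introduced for this illustration).

```latex
p = \Pr[H(x) \le T] = \frac{T + 1}{2^{256}},
\qquad
\mathbb{E}[\text{attempts}] = \frac{1}{p} = \frac{2^{256}}{T + 1}
```

For example, a target of T = 2^{244} - 1 (the first 12 bits of the hash must be zero) gives an expected 2^{12} = 4096 attempts, and halving the target doubles the expected work.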

Hash and Hashing Example

Hashing the word "hello" will produce a result that is the same length as the hash for "I'm going to the store." The function used to generate the hash is deterministic, meaning that it will produce the same result every time the same input is used. It can generate a hash from the input efficiently, it makes the input difficult to determine from the output (which is what mining relies on), and a small change to the input turns the output into an unrecognizable, completely different hash.
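You can check these claims yourself with a few lines of Python. The sketch below assumes SHA-256 (the article does not name a specific function at this point), and the helper name is made up for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short = sha256_hex("hello")
long_ = sha256_hex("I'm going to the store.")
changed = sha256_hex("Hello")  # only the first letter differs

print(len(short), len(long_))        # both hashes have the same length
print(short == sha256_hex("hello"))  # same input, same hash
print(short == changed)              # a tiny change gives a different hash
```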

Computing the hashes needed to add new blocks requires significant computer processing power, which can be expensive. To encourage people and companies, called miners, to invest in the necessary technology, cryptocurrency networks reward them with both new cryptocurrency tokens and transaction fees. Miners are only rewarded if they are the first to create a hash that meets the requirements set by the target hash.

PKI transition models

The following are scenarios for implementing SHA-2 in PKI components (these examples use a two-level PKI with an offline root CA and online issuing certificate authorities), each of which can be either a new component or a migrated one:

  • Two PKI trees, one all SHA-1, the other all SHA-2.

The remaining scenarios assume one PKI tree:

  • The entire PKI tree, from the root to the endpoints, is SHA-1.
  • The entire PKI tree, from the root to the endpoints, is SHA-2.
  • The root is SHA-1, the issuing CAs are SHA-2, and the endpoint certificates are SHA-2.
  • The root is SHA-1, the issuing CAs are SHA-2, and the endpoint certificates are SHA-1.
  • The root is SHA-1, the issuing CAs are SHA-2 and SHA-1, and the endpoint certificates are SHA-2 and SHA-1.
  • The root is SHA-2, the issuing CAs are SHA-1, and the endpoint certificates are SHA-1.
  • The root is SHA-2, the issuing CAs are SHA-2, and the endpoint certificates are SHA-1.
  • The root is SHA-2, the issuing CAs are SHA-2 and SHA-1, and the endpoint certificates are SHA-2 and SHA-1.

It is also possible to have an issuing CA that switches between SHA-1 and SHA-2 as needed, but this is likely to cause confusion in PKI services (and is not particularly recommended). If possible, to ease the transition, you can run parallel PKIs, one with SHA-1, the other with SHA-2, and then migrate the devices used once testing allows.

Note: The root CA's own CA certificate does not need to be migrated to SHA-2, even if it still uses SHA-1. SHA-1 deprecation checks only care about the certificates below the root CA's own certificate (and will continue to do so, at least for the foreseeable future). However, it makes sense to move everything, including the root CA's own CA certificate, to SHA-2 so that you can say that the entire PKI is SHA-2 and avoid further SHA-1-related changes for the foreseeable future.

Public CAs have already migrated from SHA-1 to SHA-2 for any certificates with a lifetime expiring after January 1, 2017, so you should focus your efforts on servers and applications that have not yet migrated to SHA-2 public digital certificates. Once this issue is resolved, you can start looking at internal PKIs and trusted parties. The transition from SHA-1 to SHA-2 is not technically difficult, but it is a massive logistical change with many implications that requires extensive testing.

It's unlikely that most vendors will know the exact date of death of SHA-1 (i.e., the date when its use in an application or device will result in "fatal" errors), but it will likely happen sooner than you expect as more users switch to SHA-2. So you really should make the switch now.

Popular questions about hash

What is hash and hash function?

Hash functions are mathematical functions that transform or “map” a given set of data into a fixed-size bit string, also known as a “hash” (or hash code, hash sum, hash value, etc.).

How is hash calculated?

A hash function uses complex mathematical algorithms that convert arbitrary-length data into fixed-length data (for example, 256 bits). If you change one bit anywhere in the original data, the entire hash value changes, making it useful for checking the integrity of digital files and other data.
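The effect of a single changed bit can be measured directly. The following sketch flips one bit of the input and counts how many bits of the SHA-256 output differ. The helper names and the sample message are chosen for this illustration only.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"The quick brown fox jumps over the lazy dog"
tampered = flip_bit(original, 0)

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(tampered).digest()

print(differing_bits(original, tampered))  # the inputs differ in one bit
print(differing_bits(h1, h2))              # the hashes differ in many bits
```

For a good hash function roughly half of the 256 output bits are expected to change, so the two hashes look unrelated.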

What are hashes used for in blockchains?

Hashes are used in several parts of the blockchain system. First, each block contains a hash of the block header of the previous block, ensuring that nothing was changed when new blocks were added. Cryptocurrency mining using proof of work (PoW) also hashes randomly generated numbers in order to arrive at a specific hashed value containing a series of leading zeros. This arbitrary task is resource intensive, making it difficult for an attacker to take over the network.
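The first of these uses, linking blocks through the previous block's hash, can be shown with a very small model. This is a simplified sketch: real blockchains hash a binary block header, whereas here each block is a small dictionary hashed via JSON, and all names are hypothetical.

```python
import hashlib
import json


def hash_block(block: dict) -> str:
    """SHA-256 of a block's JSON form (keys sorted so the result is stable)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: list) -> list:
    """Create blocks in which each one stores the hash of its predecessor."""
    chain = []
    prev_hash = "0" * 64  # placeholder for the first block, which has no parent
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = hash_block(block)
    return chain


def is_valid(chain: list) -> bool:
    """Check that every stored prev_hash matches the recomputed hash."""
    return all(
        chain[i]["prev_hash"] == hash_block(chain[i - 1])
        for i in range(1, len(chain))
    )
```

If someone edits an earlier block, its hash changes and no longer matches the value stored in the next block, so `is_valid` returns False.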

SHA-2 family

SHA-2 is a cryptographic hashing standard that software and hardware should be using for at least the next couple of years. SHA-2 is very often called the SHA-2 family of hash functions because it contains many hashes of different sizes, including 224-, 256-, 384- and 512-bit variants. When someone says they use SHA-2, the length of their hash is unknown, but the most popular one right now is 256-bit. Although SHA-2 has some of the same mathematical characteristics as SHA-1 and has some minor flaws, it is still considered “strong” in the crypto world. Without a doubt, it is better than SHA-1, and any critical certificate, application or hardware device that uses SHA-1 should be moved to SHA-2.
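The sizes in the family are easy to confirm with Python's standard `hashlib` module. This is a small sketch, and the tuple and function names are made up for the example.

```python
import hashlib

# The four SHA-2 variants mentioned above, by their hashlib names.
SHA2_VARIANTS = ("sha224", "sha256", "sha384", "sha512")


def digest_bits(name: str) -> int:
    """Output size of a hashlib algorithm in bits."""
    return hashlib.new(name).digest_size * 8


for name in SHA2_VARIANTS:
    print(name, digest_bits(name))
```

Passing "sha1" to the same helper reports 160 bits, which is the shorter output that SHA-2 replaces.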

Summary

  • A hash function meets the cryptographic requirements needed to solve computations on a blockchain.
  • Hashes have a fixed length, which makes it almost impossible to guess the length of the input if someone tries to attack the blockchain.
  • The same data will always produce the same hashed value.
  • A hash, like a nonce or a solution, is the basis of the blockchain network.
  • The hash is created based on the information contained in the block header.

And that’s all about hashes and hashing for today. I hope the article was useful to you. Share the article on social networks and instant messengers and bookmark the site. Good luck and see you again on the pages of the Tyulyagin project!
