Study Guide: Blockchain and Web3 Development: Blockchain Fundamentals - Cryptographic Hashing, SHA-256, Keccak-256, Merkle Trees
Source: https://www.fatskills.com/cryptocurrency-bitcoin-blockchain-and-more/chapter/blockchain-and-web3-development-blockchain-and-web3-development-blockchain-fundamentals-cryptographic-hashing-sha256-keccak256-merkle-trees

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

What This Is

Cryptographic hashing turns any piece of data into a fixed-size fingerprint that is, for practical purposes, unique. In decentralized systems the hash is the “anchor” that lets everyone agree on the exact state of a contract, a transaction, or a collection of files without trusting a single party. For example, when an NFT is minted for a marketplace such as OpenSea, the contract can store the Keccak-256 hash of the token’s metadata; if the metadata is swapped later, the hash no longer matches, which protects both creator and buyer.
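
To make the "fixed-size fingerprint" idea concrete, here is a minimal Node.js sketch. It uses SHA-256 from Node's built-in crypto module, and the helper name sha256Hex is only illustrative. It shows that inputs of very different lengths yield digests of the same length, and that a one-character change yields a different digest.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a UTF-8 string (illustrative helper).
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

const shortDigest = sha256Hex('hello');
const longDigest = sha256Hex('hello'.repeat(1000)); // 5,000-character input
const tweakedDigest = sha256Hex('hellp');           // one character changed

console.log(shortDigest.length, longDigest.length); // both 64 hex chars (32 bytes)
console.log(shortDigest === tweakedDigest);         // false
```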


Key Terms & Code Snippets

  • SHA-256: A 256-bit hash function from the SHA-2 family, widely used outside the EVM (e.g., Bitcoin block headers, IPFS content IDs).
    ```js
    const crypto = require('crypto');
    const hash = crypto.createHash('sha256').update('hello').digest('hex'); // "2cf24dba..."
    ```

  • Keccak-256 (aka keccak256): The hash function with a native EVM opcode; Solidity’s keccak256() returns a bytes32. It uses the original Keccak padding, so its output differs from NIST SHA3-256.
    ```solidity
    bytes32 public root = keccak256(abi.encodePacked(msg.sender, block.timestamp));
    ```

  • Merkle Tree: A binary tree where each leaf is a data hash and each parent node is the hash of its two children. The top-most hash (the Merkle root) summarizes the entire dataset.

  • Merkle Proof: A short array of sibling hashes that lets a verifier recompute the Merkle root for a single leaf without downloading the whole tree.
    ```js
    // ethers.js example; getProof and verifyProof are assumed to be functions your own contract exposes
    const proof = await contract.getProof(tokenId); // bytes32[]
    const isValid = await contract.verifyProof(proof, leafHash, merkleRoot);
    ```

  • Leaf Hash: The hash of the raw data that sits at the bottom of a Merkle tree (e.g., an address whitelisted for a token sale).
    ```solidity
    bytes32 leaf = keccak256(abi.encodePacked(whitelistedAddress));
    ```

  • Merkle Root (bytes32 merkleRoot): Stored in a contract; all off-chain participants can verify inclusion proofs against this single value.

  • abi.encodePacked vs abi.encode: encodePacked concatenates arguments with no padding or length information (compact, but two adjacent dynamic arguments can pack to the same bytes for different inputs); encode pads each value to 32 bytes and records offsets and lengths for dynamic types (unambiguous, so safer for complex types).

  • bytes32 vs bytes: bytes32 is a fixed-size 32-byte value (ideal for hashes); bytes is dynamic and costs more gas when stored.

  • Gas-Optimized Hashing: Use keccak256(abi.encodePacked(...)) only when at most one argument is dynamic (string, bytes, or a dynamic array), so the packed bytes are unambiguous; otherwise prefer keccak256(abi.encode(...)).

  • Off-Chain Verification: You can compute a Merkle root in JavaScript and submit only the root on-chain, so storage cost does not grow with the size of the data set.

  • require(leaf == keccak256(...)): A common pattern to assert that a caller’s data matches a pre-computed leaf hash before accepting a proof.

  • mapping(bytes32 => bool) claimed;: Stores whether a particular leaf (e.g., airdrop claim) has already been used, preventing double claims.
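
The Merkle tree, leaf hash, root, and proof terms above can be tied together in a small sketch that runs with Node's standard library alone. Several things here are assumptions made for illustration: SHA-256 stands in for Keccak-256 (Node's crypto module does not ship Keccak-256), pairs are hashed in sorted order, an unpaired node is carried up unchanged, and the helper names buildLevels, getProof, and verify are not taken from any library.

```javascript
const crypto = require('crypto');

// Assumption: SHA-256 stands in for Keccak-256 so the sketch needs no npm packages.
const hash = (buf) => crypto.createHash('sha256').update(buf).digest();

// Hash two nodes in sorted order, so a verifier needs no left/right flags.
function hashPair(a, b) {
  return Buffer.compare(a, b) <= 0
    ? hash(Buffer.concat([a, b]))
    : hash(Buffer.concat([b, a]));
}

// Build every level of the tree; an unpaired last node is carried up unchanged.
function buildLevels(leaves) {
  const levels = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      next.push(i + 1 < prev.length ? hashPair(prev[i], prev[i + 1]) : prev[i]);
    }
    levels.push(next);
  }
  return levels;
}

// Collect the sibling at each level on the path from leaf `index` to the root.
function getProof(levels, index) {
  const proof = [];
  for (let lvl = 0; lvl < levels.length - 1; lvl++) {
    const sibling = index ^ 1;
    if (sibling < levels[lvl].length) proof.push(levels[lvl][sibling]);
    index = Math.floor(index / 2);
  }
  return proof;
}

// Recompute the root from a leaf and its proof, then compare.
function verify(proof, root, leaf) {
  let node = leaf;
  for (const sibling of proof) node = hashPair(node, sibling);
  return node.equals(root);
}

const leaves = ['alice', 'bob', 'carol', 'dave', 'erin'].map((s) => hash(Buffer.from(s)));
const levels = buildLevels(leaves);
const root = levels[levels.length - 1][0];
const proof = getProof(levels, 2); // proof for 'carol'

const memberOk = verify(proof, root, leaves[2]);
const outsiderOk = verify(proof, root, hash(Buffer.from('mallory')));
console.log(memberOk, outsiderOk, proof.length); // true false 3
```

With five leaves the proof for one leaf holds three hashes, while the verifier only ever needs the 32-byte root.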


Step-by-Step / Process Flow

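  1. Generate the data set & build the Merkle tree off-chain
    ```js
    const { MerkleTree } = require('merkletreejs');
    const keccak256 = require('keccak256');
    const { ethers } = require('ethers'); // ethers v5 API

    // entries = [{ address, amount }, ...]; each leaf must match the encoding used in claim()
    const leafFor = (e) => ethers.utils.solidityKeccak256(['address', 'uint256'], [e.address, e.amount]);
    const leaves = entries.map(leafFor);
    const tree = new MerkleTree(leaves, keccak256, { sortPairs: true });
    const root = tree.getHexRoot(); // 0x-prefixed bytes32
    ```

  2. Deploy a Solidity contract that stores the root
    ```solidity
    import "@openzeppelin/contracts/utils/cryptography/MerkleProof.sol";

    contract Airdrop {
        bytes32 public immutable merkleRoot;
        mapping(address => bool) public claimed;

        constructor(bytes32 _root) {
            merkleRoot = _root;
        }
    }
    ```

  3. Write a claim function (inside Airdrop) that verifies a proof
    ```solidity
    function claim(uint256 amount, bytes32[] calldata proof) external {
        require(!claimed[msg.sender], "already claimed");
        bytes32 leaf = keccak256(abi.encodePacked(msg.sender, amount));
        require(MerkleProof.verify(proof, merkleRoot, leaf), "invalid proof");
        claimed[msg.sender] = true; // mark as claimed before any external call
        // transfer tokens / mint NFT …
    }
    ```

  4. Compile & test with Hardhat
    ```bash
    npx hardhat compile
    npx hardhat test   # includes a test that builds a proof and calls claim()
    ```

  5. Deploy to a testnet (e.g., Sepolia; Goerli has been deprecated) via a script
    ```js
    // Hardhat script using the ethers v5 plugin (in v6, use waitForDeployment() and getAddress())
    const Airdrop = await ethers.getContractFactory("Airdrop");
    const airdrop = await Airdrop.deploy(root);
    await airdrop.deployed();
    console.log("Deployed at:", airdrop.address);
    ```

  6. Interact from the front-end using Ethers.js
    ```js
    // Rebuild the same leaf that was used in step 1, then fetch its proof
    const leaf = leafFor({ address: userAddress, amount });
    const proof = tree.getHexProof(leaf); // array of 0x-prefixed bytes32 strings
    await airdrop.claim(amount, proof);
    ```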
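
The order of checks in the claim step can be rehearsed off-chain before writing any Solidity. The sketch below is a plain Node.js simulation under stated assumptions: SHA-256 stands in for Keccak-256, the leaf encoding in leafFor is a hypothetical string format (not byte-compatible with abi.encodePacked), and the two-entry allowlist keeps each proof to a single sibling.

```javascript
const crypto = require('crypto');

// Assumption: SHA-256 stands in for Keccak-256 so no npm packages are needed.
const hash = (buf) => crypto.createHash('sha256').update(buf).digest();
const hashPair = (a, b) =>
  hash(Buffer.compare(a, b) <= 0 ? Buffer.concat([a, b]) : Buffer.concat([b, a]));

// Hypothetical leaf encoding: account and amount joined with a separator.
const leafFor = (account, amount) => hash(Buffer.from(`${account}:${amount}`));

// Two-entry allowlist, so each proof is just the other leaf.
const leafA = leafFor('0xAlice', 100);
const leafB = leafFor('0xBob', 250);
const merkleRoot = hashPair(leafA, leafB);

const claimed = new Set(); // plays the role of mapping(address => bool) claimed

function claim(sender, amount, proof) {
  if (claimed.has(sender)) throw new Error('already claimed');
  let node = leafFor(sender, amount);
  for (const sibling of proof) node = hashPair(node, sibling);
  if (!node.equals(merkleRoot)) throw new Error('invalid proof');
  claimed.add(sender); // record the claim before any payout
  return amount;
}

const outcomes = [];
const attempt = (...args) => {
  try {
    outcomes.push(`paid ${claim(...args)}`);
  } catch (err) {
    outcomes.push(err.message);
  }
};

attempt('0xAlice', 100, [leafB]); // valid claim
attempt('0xAlice', 100, [leafB]); // replay of the same claim
attempt('0xBob', 999, [leafA]);   // amount does not match the committed leaf
console.log(outcomes); // [ 'paid 100', 'already claimed', 'invalid proof' ]
```

Because the amount is part of the leaf, a caller cannot reuse a valid proof to claim a different amount.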


Common Mistakes

  • Mistake: Using keccak256(abi.encodePacked(a, b)) when a and b are both dynamic types (e.g., two strings), so different inputs can pack to the same bytes.
    Correction: Prefer keccak256(abi.encode(a, b)), or keep all but one argument fixed-size (address and uint256 together are safe); this prevents hash collisions that could let an attacker claim another’s leaf.

  • Mistake: Storing the entire Merkle tree on-chain to “prove” inclusion.
    Correction: Only store the root; the proof is supplied by the caller. On-chain storage stays at a single 32-byte slot no matter how large the data set is.

  • Mistake: Forgetting to mark a leaf as claimed, allowing the same airdrop to be claimed twice.
    Correction: Use a mapping(bytes32 => bool) (or address => bool) and set it before transferring assets to avoid re-entrancy and replay attacks.

  • Mistake: Mixing up SHA-256 and keccak256 when generating proofs off-chain.
    Correction: Always use the same hash algorithm on both sides. Solidity does expose sha256() through a precompile, but OpenZeppelin’s MerkleProof.verify hashes pairs with keccak256, so build the tree with Keccak-256 unless you have a specific reason not to.

  • Mistake: Assuming the committed data set is fixed when the root can still be changed; whoever can overwrite the root can add or remove leaves.
    Correction: Make the root immutable after deployment (e.g., an immutable variable) or allow updates only through a governance-controlled function with a proper timelock.
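
The packed-encoding collision from the first mistake is easy to reproduce outside the EVM. The sketch below is an illustration only: SHA-256 replaces Keccak-256, packed mimics the "no padding, no lengths" behaviour of abi.encodePacked for strings, and prefixed is a simplified length-prefixed encoding rather than the real abi.encode layout (which uses 32-byte words and offsets).

```javascript
const crypto = require('crypto');

const sha256Hex = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

// Packed-style encoding: the raw bytes of each string, back to back.
const packed = (...parts) => Buffer.concat(parts.map((p) => Buffer.from(p, 'utf8')));

// Simplified length-prefixed encoding: a 4-byte big-endian length before each string.
const prefixed = (...parts) =>
  Buffer.concat(
    parts.flatMap((p) => {
      const body = Buffer.from(p, 'utf8');
      const len = Buffer.alloc(4);
      len.writeUInt32BE(body.length, 0);
      return [len, body];
    })
  );

// ("ab", "c") and ("a", "bc") both pack to the bytes of "abc".
const packedCollides = sha256Hex(packed('ab', 'c')) === sha256Hex(packed('a', 'bc'));
const prefixedCollides = sha256Hex(prefixed('ab', 'c')) === sha256Hex(prefixed('a', 'bc'));

console.log(packedCollides);   // true: different inputs, same hash
console.log(prefixedCollides); // false: lengths keep the inputs distinct
```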


Blockchain Developer Interview / Practical Insights

  1. “Explain why a Merkle root is cheaper than storing an array of addresses.”
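    Interviewers expect you to discuss gas costs: writing a new 32-byte storage slot costs roughly 20k gas, so a root costs one slot versus n slots for an array of n addresses. The trade-off is that each claimer pays for the proof’s calldata and about log2(n) hash operations.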

  2. “How would you verify a Merkle proof in Solidity without using OpenZeppelin’s library?”
    Show the loop that recomputes the hash, handling the order of sibling nodes (if (leaf < sibling) leaf = keccak256(abi.encodePacked(leaf, sibling)); else …).

  3. “What are the security implications of using tx.origin in a whitelist check?”
    Auditors look for the classic phishing-style attack where a user is tricked into calling a malicious contract that then calls your contract; tx.origin would incorrectly grant permission.

  4. “Distinguish between a Merkle tree and a Merkle-Patricia trie (the structure the EVM uses for state).”
    Highlight that a Merkle-Patricia trie is a key-value store with hex-nibble branching, optimized for sparse data, while a classic Merkle tree is a binary hash aggregation used for batch verification.
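
For the first interview question, a back-of-the-envelope calculation helps. The sketch below assumes the approximate figure of 20k gas per newly written storage slot quoted in that answer and one slot per stored address. It ignores the array length slot, transaction overhead, and the cost claimers pay to verify proofs, so treat the numbers as rough orders of magnitude.

```javascript
// Approximate cost of writing one new 32-byte storage slot (figure quoted in the text).
const GAS_PER_NEW_SLOT = 20_000;

// Storage gas for keeping every address on-chain (one slot each) vs. a single root.
function storageGas(addressCount) {
  return {
    addressArray: addressCount * GAS_PER_NEW_SLOT,
    merkleRoot: 1 * GAS_PER_NEW_SLOT,
  };
}

const estimate = storageGas(10_000);
console.log(estimate); // { addressArray: 200000000, merkleRoot: 20000 }
```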


Quick Check Questions

  1. Scenario: A contract stores bytes32 public root; and a user submits a proof that fails MerkleProof.verify.
    Answer: The transaction reverts because the proof does not reconstruct the stored root, meaning the leaf is not part of the original dataset.

  2. Scenario: You accidentally used keccak256(abi.encodePacked(...)) with two string arguments for a whitelist, so two different users could produce the same hash.
    Answer: The collision lets an attacker claim another user’s allocation; the fix is to use abi.encode or to hash only fixed-size fields (e.g., an address and a uint256).

  3. Scenario: A DeFi protocol hashes messages with SHA-256 off-chain, but its contract recomputes the digest with keccak256 before checking the signature.
    Answer: Verification always fails because the two digests differ. Use the same hash on both sides; Keccak-256 is the usual choice on the EVM, although sha256() is also available on-chain through a precompile.


Last-Minute Cram Sheet (10 one-liners)

  1. Never use tx.origin for auth; it can be hijacked through a malicious contract.
  2. Keccak-256 = the EVM’s native hash; always the go-to for on-chain data integrity.
  3. SHA-256 is mostly an off-chain hash; on-chain it is available as sha256() via a precompile, but the hash must be identical on both sides.
  4. Merkle root = single bytes32 stored on-chain; all proofs are verified against it.
  5. Proof size ≈ log2(N) hashes; a 1M-leaf tree needs only ~20 bytes32 entries.
  6. abi.encodePacked is cheap but can collide; use it only with fixed-size types (or at most one dynamic argument).
  7. immutable variables cost less gas than regular storage when set once in the constructor.
  8. OpenZeppelin’s MerkleProof library is battle-tested; re-implement only if you have a strong reason.
  9. Gas tip: Declare the root immutable (or constant if it is known at compile time) when it never changes; each read then avoids an SLOAD (2,100 gas when cold).
  10. Security trap: Forgetting to mark a leaf as claimed opens a replay attack; always update state before external calls.
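
The proof-size rule of thumb in item 5 can be checked with a few lines of arithmetic. This sketch assumes a balanced binary tree and counts how many doublings are needed to cover the leaves, which equals the ceiling of log2(N); the helper name proofLength is illustrative.

```javascript
const BYTES_PER_HASH = 32;

// Number of sibling hashes in a proof for a balanced binary tree with `leafCount` leaves.
function proofLength(leafCount) {
  let depth = 0;
  for (let width = 1; width < leafCount; width *= 2) depth++;
  return depth;
}

const sizes = [8, 1_000, 1_000_000].map((n) => ({
  leaves: n,
  hashes: proofLength(n),
  bytes: proofLength(n) * BYTES_PER_HASH,
}));
console.log(sizes);
// 8 leaves -> 3 hashes (96 bytes); 1,000 -> 10 (320 bytes); 1,000,000 -> 20 (640 bytes)
```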