Complete Guide to Encryption, Hashing & Encoding

The authoritative reference for understanding encryption, hashing, and encoding — what each one does, when to use it, and how they interact in real systems.

TL;DR — Key Points
  • Encryption: transforms data so only authorized parties can read it. Reversible with the correct key. Provides confidentiality.
  • Hashing: produces a fixed-size fingerprint of data. One-way, irreversible. Provides integrity verification and secure password storage.
  • Encoding: converts data to a compatible format for transport or storage. Fully reversible by anyone. Provides zero security.
  • HMAC: a keyed hash that proves both integrity and authenticity. Required when a plain hash is not enough.
  • Authenticated Encryption: encryption that also detects tampering (e.g., AES-GCM). Use this instead of plain encryption in almost all cases.
  • Rule of thumb: if you need secrecy, encrypt. If you need integrity, hash or HMAC. If you need format conversion, encode. Never confuse encoding with security.

What is Encryption?

Encryption is a reversible transformation of data using a cryptographic key. The original plaintext is converted into ciphertext that is unintelligible without the corresponding key. Only parties in possession of the correct key can decrypt and recover the original data. Encryption provides confidentiality — it does not inherently guarantee integrity or authenticity.

There are two categories of encryption algorithms. Symmetric encryption uses the same key to encrypt and decrypt. It is fast and suited for encrypting large volumes of data — files, database fields, network traffic, disk volumes. The primary challenge is securely distributing and storing the shared key. AES (Advanced Encryption Standard) is the dominant symmetric algorithm and is considered the current standard for general-purpose encryption.

Asymmetric encryption uses a mathematically linked key pair: a public key that can be shared openly, and a private key that must remain secret. Data encrypted with the public key can only be decrypted with the private key. This solves the key distribution problem: you can share your public key with anyone without compromising security. RSA is the classic asymmetric encryption algorithm. Elliptic-curve schemes such as ECDH and X25519 are key-agreement algorithms: they let two parties derive a shared secret rather than encrypt data directly. Asymmetric operations are significantly slower than symmetric ones and cannot efficiently encrypt large data directly.

Most production systems use hybrid encryption: asymmetric cryptography negotiates or transfers a session key, and symmetric encryption (AES) handles the actual data. TLS, PGP, and Signal all use this pattern. For a detailed comparison, see the AES vs RSA comparison.

What is Hashing?

Hashing is a one-way transformation that converts data of any size into a fixed-size output called a digest or hash. It is computationally infeasible to reverse a hash to recover the original input. A cryptographic hash function also ensures that any change to the input — even a single bit — produces a completely different hash, making it reliable for detecting modifications.
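The two properties described above (fixed-size output and sensitivity to small input changes) can be observed with Python's standard `hashlib` module. This is a minimal sketch; the helper name `sha256_hex` and the sample messages are illustrative assumptions, not part of any particular system.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Fixed-size output: one byte and one million bytes both give 64 hex characters.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Sensitivity to change: "1" (0x31) and "9" (0x39) differ by a single bit,
# yet the two digests are unrelated.
d1 = sha256_hex(b"transfer $100 to alice")
d2 = sha256_hex(b"transfer $900 to alice")
differing = sum(c1 != c2 for c1, c2 in zip(d1, d2))

print(len(short_digest), len(long_digest))
print(f"{differing} of {len(d1)} hex characters differ")
```

The digest length depends only on the algorithm, never on the input, which is why a hash can serve as a compact fingerprint for data of any size.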

SHA-256 and SHA-3 are the current standard general-purpose hash functions. MD5 and SHA-1 have known collision vulnerabilities and must not be used for security-critical applications — see the MD5 vs SHA-256 comparison for details. For password storage, use purpose-built slow-hashing algorithms: bcrypt, scrypt, or Argon2. These are intentionally slow to make brute-force attacks impractical, and they incorporate salting to prevent precomputed attacks (rainbow tables).
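As a rough illustration of salted, deliberately slow password hashing, the sketch below uses scrypt from Python's standard library (`hashlib.scrypt`, which requires an OpenSSL build with scrypt support). The function names and the cost parameters are assumptions chosen for the example; production values should be tuned to your own hardware and threat model.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumption): roughly 16 MiB of memory per hash.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage. A fresh random salt is used per password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Because each password gets its own random salt, two users with the same password end up with different stored digests, which is what defeats precomputed tables.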

Plain hashing proves integrity — that data has not changed — but it proves nothing about who produced the hash. Anyone can compute a SHA-256 hash. When authenticity matters alongside integrity, use HMAC. For a full treatment of hashing and HMAC, see the Hashing and HMAC guide.

What is Encoding?

Encoding converts data into a different representation to satisfy compatibility or transport requirements. It uses no secret and requires no key — any system can decode encoded data immediately using publicly documented rules. Encoding provides no confidentiality, no integrity guarantee, and no authentication. It is a formatting mechanism, not a security mechanism.

Base64 encodes binary data as printable ASCII characters, making it safe to embed in JSON, XML, email bodies, and HTML data URIs. URL encoding (percent-encoding) escapes characters that have special meaning in URLs so they can appear in query strings without breaking parsing. Hex encoding represents each byte as two hexadecimal digits, commonly used when displaying hash outputs or debugging binary protocols.
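The three encodings above can be tried with Python's standard library. This is a small sketch with made-up sample values; the point to notice is that every decode step works without any key or secret.

```python
import base64
from urllib.parse import quote, unquote

raw = b"hello"

# Base64: binary data as printable ASCII.
b64 = base64.b64encode(raw).decode("ascii")   # "aGVsbG8="

# Hex: two hexadecimal digits per byte.
hexed = raw.hex()                             # "68656c6c6f"

# URL (percent) encoding: escape characters that are special in URLs.
escaped = quote("a b&c=d", safe="")           # "a%20b%26c%3Dd"

# Decoding needs no key: anyone can reverse all three.
assert base64.b64decode(b64) == raw
assert bytes.fromhex(hexed) == raw
assert unquote(escaped) == "a b&c=d"

print(b64, hexed, escaped)
```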

A common and serious mistake is treating Base64 as a security layer. Base64 is trivially decoded. Storing sensitive data as Base64 and calling it "protected" is equivalent to no protection. If you need confidentiality, encrypt first, then encode if transport requires it. For a direct comparison of Base64 and URL encoding, see Base64 vs URL Encoding.

When not to use encoding

  • Do not use encoding when data needs to stay secret — use encryption.
  • Do not use encoding when you need to verify data integrity — use hashing.
  • Do not mistake "it looks garbled" for "it is protected".

Encryption vs Hashing vs Encoding

The three mechanisms are distinct in purpose, reversibility, and security properties. Conflating them is a common source of security errors.

| Feature | Encryption | Hashing | Encoding |
| --- | --- | --- | --- |
| Reversible? | Yes (with the correct key) | No (one-way only) | Yes (by anyone) |
| Uses a key? | Yes (symmetric or asymmetric) | No (HMAC adds a key) | No |
| Primary purpose | Confidentiality | Integrity verification, password storage | Format compatibility |
| Security level | High (depends on key secrecy) | High (depends on algorithm choice) | None |
| Common algorithms | AES-256, ChaCha20, RSA, ECDH | SHA-256, SHA-3, bcrypt, Argon2 | Base64, URL encoding, Hex |
| Typical use cases | Files, database fields, TLS, messaging | Password storage, checksums, API signing | JSON payloads, URLs, email attachments |

Symmetric vs Asymmetric Encryption

Symmetric encryption uses one key for both encryption and decryption. Asymmetric encryption uses a mathematically linked key pair: a public key for encryption and a private key for decryption. Each approach has distinct trade-offs that determine where it belongs in a system design.

| Property | Symmetric (e.g., AES) | Asymmetric (e.g., RSA) |
| --- | --- | --- |
| Keys | One shared secret key | Public key + private key pair |
| Speed | Very fast | Orders of magnitude slower |
| Data size limit | None (suitable for large data) | Limited (small data only, typically under key size) |
| Key distribution | Requires secure out-of-band sharing | Public key can be shared openly |
| Primary use | Bulk data encryption | Key exchange, digital signatures |

See the full AES vs RSA comparison for algorithm-level details, or the dedicated symmetric vs asymmetric encryption guide for use-case guidance.

What is HMAC?

HMAC (Hash-based Message Authentication Code) is a construction that combines a cryptographic hash function with a secret key. It produces an authentication tag that proves both that the data is unmodified (integrity) and that it was produced by a party holding the shared secret (authenticity). A plain hash proves only integrity — anyone can compute it.

HMAC is used extensively for API request authentication (AWS Signature V4, Stripe webhooks), cookie signing, and JWT validation with symmetric keys (HS256). The underlying hash algorithm matters: HMAC-SHA256 is the current standard; HMAC-MD5 and HMAC-SHA1 should be avoided for new systems.
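A minimal sketch of signing and verifying a message with HMAC-SHA256, using Python's standard `hmac` module. The key, the payload, and the function names `sign` and `verify` are hypothetical; real services such as the ones named above define their own header formats and signing rules, which this example does not reproduce.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of `message` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-shared-secret"   # hypothetical key known to both parties
payload = b'{"event": "payment.succeeded", "amount": 100}'
tag = sign(shared_key, payload)

print(verify(shared_key, payload, tag))                         # valid
print(verify(shared_key, payload.replace(b"100", b"900"), tag)) # tampered payload
print(verify(b"some-other-key", payload, tag))                  # wrong key
```

Unlike a plain SHA-256 digest, the tag cannot be recomputed by someone who lacks the key, so a matching tag indicates both that the payload is unchanged and that it came from a key holder.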

For a direct comparison between SHA-256 and HMAC-SHA256 — specifically when each is appropriate — see SHA-256 vs HMAC-SHA256. For a comprehensive treatment of hashing and HMAC together, see the Hashing and HMAC guide.

What is Authenticated Encryption?

Authenticated encryption (AEAD — Authenticated Encryption with Associated Data) combines confidentiality and integrity in a single operation. It encrypts the data and produces an authentication tag. Decryption fails if the ciphertext or associated metadata has been tampered with. This prevents a class of attacks where an adversary modifies ciphertext to manipulate the decrypted output.

AES-GCM (Galois/Counter Mode) is the most widely deployed AEAD cipher. It is hardware-accelerated on modern processors and produces a 128-bit authentication tag. ChaCha20-Poly1305 is an alternative with strong software performance, used in TLS 1.3 and WireGuard. Both are secure choices; AES-GCM is preferred when hardware acceleration is available.

Basic encryption modes like AES-CBC do not provide integrity. An attacker can modify ciphertext in ways that produce predictable plaintext changes (bit flipping), or use a system's padding-error responses to recover plaintext (padding oracle attacks). Always prefer authenticated encryption modes. If you must use a non-authenticating mode, add an HMAC over the ciphertext and verify it before decrypting (encrypt-then-MAC).
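The tamper detection described in this section can be sketched with AES-256-GCM. This example assumes the third-party `cryptography` package is installed (it is not part of the Python standard library); the plaintext and the associated-data value are made up for illustration.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aead = AESGCM(key)

# A GCM nonce must never repeat under the same key; 12 random bytes is the usual size.
nonce = os.urandom(12)
aad = b"record-id=42"   # hypothetical associated data: authenticated but not encrypted

plaintext = b"secret payload"
ciphertext = aead.encrypt(nonce, plaintext, aad)   # ciphertext with the tag appended
recovered = aead.decrypt(nonce, ciphertext, aad)

# Flip one bit of the ciphertext: decryption raises instead of returning altered data.
tampered = bytearray(ciphertext)
tampered[0] ^= 0x01
try:
    aead.decrypt(nonce, bytes(tampered), aad)
    tamper_detected = False
except InvalidTag:
    tamper_detected = True

print(recovered == plaintext, tamper_detected)
```

The same `InvalidTag` failure occurs if the associated data passed to `decrypt` differs from what was supplied at encryption time, which is how AEAD binds unencrypted metadata to the ciphertext.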

For complete coverage of AEAD algorithms, nonce requirements, and failure modes, see the Authenticated Encryption and Integrity guide.

Encryption at Rest vs Encryption in Transit

Encryption at rest protects data stored on disk, in databases, or in object storage. Encryption in transit protects data moving across a network between systems or clients. Both are required in most production environments — they address different threat surfaces and neither substitutes for the other.

| Property | At Rest | In Transit |
| --- | --- | --- |
| Threat addressed | Physical theft, unauthorized storage access | Network interception, man-in-the-middle |
| Common implementation | AES-256 (full-disk, database-level, file-level) | TLS 1.3 with AES-GCM or ChaCha20-Poly1305 |
| Key management | KMS, HSM, envelope encryption | TLS certificates, session key negotiation |
| Does not protect against | Application-layer breaches (data decrypted to be used) | Compromised endpoints; data at rest on servers |

How to Decide Which One You Need

Start with your requirement, not the technology. Apply the following logic:

  • IF you need to keep data secret from unauthorized parties: Use Encryption. AES-GCM for symmetric; RSA or ECDH for key exchange. Always use authenticated modes.
  • IF you need to verify data has not been modified (integrity only): Use Hashing. SHA-256 for checksums and file verification. bcrypt or Argon2 for passwords.
  • IF you need integrity and proof of who produced the data: Use HMAC. HMAC-SHA256 is standard. Both parties must share a secret key. See the Hashing and HMAC guide.
  • IF you need confidentiality and integrity together: Use Authenticated Encryption (AES-GCM or ChaCha20-Poly1305). See the Authenticated Encryption guide.
  • IF you need to convert data format for transport or system compatibility: Use Encoding. Base64 for binary-to-text; URL encoding for query strings. This provides no security.

When Should You Use Each?

Concrete scenarios with the correct approach for each:

| Scenario | Correct approach |
| --- | --- |
| Storing user passwords in a database | bcrypt / Argon2 (hashing) |
| Encrypting a file for storage in S3 | AES-256-GCM (authenticated encryption) |
| Sending a binary image in a JSON API response | Base64 (encoding) |
| Verifying a webhook payload came from Stripe | HMAC-SHA256 (authentication) |
| Protecting network traffic between client and server | TLS 1.3 (encryption in transit) |
| Verifying a downloaded file is unmodified | SHA-256 checksum (hashing) |
| Embedding special characters in a URL parameter | URL encoding (percent-encoding) |
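As one worked instance of the scenarios above, verifying a downloaded file against a published SHA-256 checksum might look like the following sketch. The function names, the chunk size, and the file path in the usage note are assumptions for illustration. Note that a checksum only helps if it was obtained over a channel the attacker cannot also modify.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())
```

A call such as `matches_checksum("installer.bin", published_value)` (with a hypothetical file name and a checksum copied from the publisher's site) returns `True` only when every byte of the file matches what the publisher hashed.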
