Complete Guide to Encryption, Hashing & Encoding
The authoritative reference for understanding encryption, hashing, and encoding — what each one does, when to use it, and how they interact in real systems.
- Encryption: transforms data so only authorized parties can read it. Reversible with the correct key. Provides confidentiality.
- Hashing: produces a fixed-size fingerprint of data. One-way, irreversible. Used for integrity verification and, with purpose-built slow algorithms, password storage.
- Encoding: converts data to a compatible format for transport or storage. Fully reversible by anyone. Provides zero security.
- HMAC: a keyed hash that proves both integrity and authenticity. Required when a plain hash is not enough.
- Authenticated Encryption: encryption that also detects tampering (e.g., AES-GCM). Use this instead of plain encryption in almost all cases.
- Rule of thumb: if you need secrecy, encrypt; if you need integrity, hash or HMAC; if you need format conversion, encode. Never confuse encoding with security.
What is Encryption?
Encryption is a reversible transformation of data using a cryptographic key. The original plaintext is converted into ciphertext that is unintelligible without the corresponding key. Only parties in possession of the correct key can decrypt and recover the original data. Encryption provides confidentiality — it does not inherently guarantee integrity or authenticity.
There are two categories of encryption algorithms. Symmetric encryption uses the same key to encrypt and decrypt. It is fast and suited for encrypting large volumes of data — files, database fields, network traffic, disk volumes. The primary challenge is securely distributing and storing the shared key. AES (Advanced Encryption Standard) is the dominant symmetric algorithm and is considered the current standard for general-purpose encryption.
Asymmetric encryption uses a mathematically linked key pair: a public key that can be shared openly, and a private key that must remain secret. Data encrypted with the public key can only be decrypted with the private key. This solves the key distribution problem: you can share your public key with anyone without compromising security. RSA is the best-known asymmetric encryption algorithm; elliptic-curve schemes such as ECDH (including X25519) are key-agreement algorithms that derive a shared secret rather than encrypting data directly. Asymmetric operations are significantly slower than symmetric ones and cannot efficiently encrypt large data directly.
Most production systems use hybrid encryption: asymmetric cryptography negotiates or transfers a session key, and symmetric encryption (AES) handles the actual data. TLS, PGP, and Signal all use this pattern. For a detailed comparison, see the AES vs RSA comparison.
What is Hashing?
Hashing is a one-way transformation that converts data of any size into a fixed-size output called a digest or hash. It is computationally infeasible to reverse a hash to recover the original input. A cryptographic hash function also ensures that any change to the input — even a single bit — produces a completely different hash, making it reliable for detecting modifications.
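The fixed-size and change-sensitivity properties described above can be seen in a few lines. This is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_hex` and the sample inputs are illustrative, not part of any particular system.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"hello")
long_digest = sha256_hex(b"hello" * 10_000)   # 50,000 bytes of input
changed_digest = sha256_hex(b"hellp")         # last character differs

# The digest is always 256 bits (64 hex characters), whatever the input size.
print(len(short_digest), len(long_digest))

# A small change in the input yields an unrelated-looking digest.
print(short_digest == changed_digest)
```

Hashing the same input always gives the same digest, which is what makes a digest usable as a fingerprint.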
SHA-256 and SHA-3 are the current standard general-purpose hash functions. MD5 and SHA-1 have known collision vulnerabilities and must not be used for security-critical applications — see the MD5 vs SHA-256 comparison for details. For password storage, use purpose-built slow-hashing algorithms: bcrypt, scrypt, or Argon2. These are intentionally slow to make brute-force attacks impractical, and they incorporate salting to prevent precomputed attacks (rainbow tables).
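As a rough illustration of salted, deliberately slow password hashing, the sketch below uses scrypt from Python's standard library. The function names and the cost parameters (`n`, `r`, `p`) are assumptions chosen for the example; a real deployment would tune them for its own hardware, and bcrypt or Argon2 would need a third-party package.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Illustrative cost parameters (assumption): roughly 16 MiB of memory per hash.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated per password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

Both the salt and the digest are stored; the salt is not secret. Because each password gets its own salt, two users with the same password end up with different stored digests.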
A plain hash detects changes to data only when the reference hash itself comes from a trusted source, and it proves nothing about who produced the hash. Anyone can compute a SHA-256 hash, so an attacker who can replace the data can usually replace its hash as well. When authenticity matters alongside integrity, use HMAC. For a full treatment of hashing and HMAC, see the Hashing and HMAC guide.
What is Encoding?
Encoding converts data into a different representation to satisfy compatibility or transport requirements. It uses no secret and requires no key — any system can decode encoded data immediately using publicly documented rules. Encoding provides no confidentiality, no integrity guarantee, and no authentication. It is a formatting mechanism, not a security mechanism.
Base64 encodes binary data as printable ASCII characters, making it safe to embed in JSON, XML, email bodies, and HTML data URIs. URL encoding (percent-encoding) escapes characters that have special meaning in URLs so they can appear in query strings without breaking parsing. Hex encoding represents each byte as two hexadecimal digits, commonly used when displaying hash outputs or debugging binary protocols.
A common and serious mistake is treating Base64 as a security layer. Base64 is trivially decoded. Storing sensitive data as Base64 and calling it "protected" is equivalent to no protection. If you need confidentiality, encrypt first, then encode if transport requires it. For a direct comparison of Base64 and URL encoding, see Base64 vs URL Encoding.
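To make the "reversible by anyone" point concrete, here is a small sketch using only Python's standard library. The sample values are made up for illustration; note that every decode step works without any key or secret.

```python
import base64
from urllib.parse import quote, unquote

secret = b"password=hunter2"   # hypothetical sensitive value

# Three common encodings of data.
b64_text = base64.b64encode(secret).decode("ascii")
hex_text = secret.hex()
url_text = quote("a b&c=d/é", safe="")   # percent-encode every reserved character

# Anyone can reverse each of them using the public rules alone.
recovered_from_b64 = base64.b64decode(b64_text)
recovered_from_hex = bytes.fromhex(hex_text)
recovered_from_url = unquote(url_text)

print(recovered_from_b64 == secret)   # True
print(recovered_from_hex == secret)   # True
print(url_text)                       # a%20b%26c%3Dd%2F%C3%A9
```

The encoded forms look unfamiliar, but they hide nothing: the original bytes come back in one call.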
When not to use encoding
- Do not use encoding when data needs to stay secret — use encryption.
- Do not use encoding when you need to verify data integrity — use hashing.
- Do not mistake "it looks garbled" for "it is protected".
Encryption vs Hashing vs Encoding
The three mechanisms are distinct in purpose, reversibility, and security properties. Conflating them is a common source of security errors.
| Feature | Encryption | Hashing | Encoding |
|---|---|---|---|
| Reversible? | Yes — with the correct key | No — one-way only | Yes — by anyone |
| Uses a key? | Yes — symmetric or asymmetric | No (HMAC adds a key) | No |
| Primary purpose | Confidentiality | Integrity verification, password storage | Format compatibility |
| Security level | High — depends on key secrecy | High — depends on algorithm choice | None |
| Common algorithms | AES-256, ChaCha20, RSA, ECDH | SHA-256, SHA-3, bcrypt, Argon2 | Base64, URL encoding, Hex |
| Typical use cases | Files, database fields, TLS, messaging | Password storage, checksums, API signing | JSON payloads, URLs, email attachments |
Symmetric vs Asymmetric Encryption
Symmetric encryption uses one key for both encryption and decryption. Asymmetric encryption uses a mathematically linked key pair: a public key for encryption and a private key for decryption. Each approach has distinct trade-offs that determine where it belongs in a system design.
| Property | Symmetric (e.g., AES) | Asymmetric (e.g., RSA) |
|---|---|---|
| Keys | One shared secret key | Public key + private key pair |
| Speed | Very fast | Orders of magnitude slower |
| Data size limit | None — suitable for large data | Limited — small data only (typically under key size) |
| Key distribution | Requires secure out-of-band sharing | Public key can be shared openly |
| Primary use | Bulk data encryption | Key exchange, digital signatures |
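The "under key size" limit in the table can be made concrete for one common case. The sketch below assumes RSA with OAEP padding, for which PKCS #1 (RFC 8017) gives a maximum message length of k - 2*hLen - 2 bytes, where k is the modulus size in bytes and hLen is the hash output size. OAEP and SHA-256 are assumptions for this example; the function name is illustrative.

```python
def oaep_max_message_bytes(modulus_bits: int, hash_len_bytes: int = 32) -> int:
    """Largest plaintext (in bytes) that RSA-OAEP can encrypt in one operation.

    hash_len_bytes defaults to 32, the output size of SHA-256.
    """
    k = modulus_bits // 8                    # modulus length in bytes
    limit = k - 2 * hash_len_bytes - 2       # bound from RFC 8017
    if limit < 0:
        raise ValueError("modulus too small for this hash function")
    return limit


print(oaep_max_message_bytes(2048))   # 190
print(oaep_max_message_bytes(3072))   # 318
print(oaep_max_message_bytes(4096))   # 446
```

A few hundred bytes is enough for a symmetric key but not for a file, which is why asymmetric algorithms are normally used to protect a key rather than the data itself.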
See the full AES vs RSA comparison for algorithm-level details, or the dedicated symmetric vs asymmetric encryption guide for use-case guidance.
What is HMAC?
HMAC (Hash-based Message Authentication Code) is a construction that combines a cryptographic hash function with a secret key. It produces an authentication tag that proves both that the data is unmodified (integrity) and that it was produced by a party holding the shared secret (authenticity). A plain hash proves only integrity — anyone can compute it.
HMAC is used extensively for API request authentication (AWS Signature V4, Stripe webhooks), cookie signing, and JWT validation with symmetric keys (HS256). The underlying hash algorithm matters: HMAC-SHA256 is the current standard; HMAC-MD5 and HMAC-SHA1 should be avoided for new systems.
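A minimal sketch of signing and verifying with HMAC-SHA256, using Python's standard `hmac` module. The function names, the key, and the message are illustrative assumptions; real APIs such as webhook signatures define their own message format and header names, which are not reproduced here.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of `message` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    expected = sign(key, message)
    return hmac.compare_digest(expected, tag)


shared_key = secrets.token_bytes(32)          # known to both parties
payload = b'{"event": "payment.succeeded"}'   # hypothetical message body
tag = sign(shared_key, payload)

print(verify(shared_key, payload, tag))                  # True
print(verify(shared_key, payload + b" ", tag))           # False: message changed
print(verify(secrets.token_bytes(32), payload, tag))     # False: wrong key
```

Unlike a plain SHA-256 digest, the tag cannot be recomputed by someone who lacks the shared key, so a valid tag indicates both an unmodified message and a sender who holds the key.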
For a direct comparison between SHA-256 and HMAC-SHA256 — specifically when each is appropriate — see SHA-256 vs HMAC-SHA256. For a comprehensive treatment of hashing and HMAC together, see the Hashing and HMAC guide.
What is Authenticated Encryption?
Authenticated encryption (AEAD — Authenticated Encryption with Associated Data) combines confidentiality and integrity in a single operation. It encrypts the data and produces an authentication tag. Decryption fails if the ciphertext or associated metadata has been tampered with. This prevents a class of attacks where an adversary modifies ciphertext to manipulate the decrypted output.
AES-GCM (Galois/Counter Mode) is the most widely deployed AEAD cipher. It is hardware-accelerated on modern processors and produces a 128-bit authentication tag. ChaCha20-Poly1305 is an alternative with strong software performance, used in TLS 1.3 and WireGuard. Both are secure choices; AES-GCM is preferred when hardware acceleration is available.
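Basic encryption modes like AES-CBC do not provide integrity. An attacker can modify ciphertext in ways that produce predictable plaintext changes (bit flipping) or exploit how a system reports decryption errors to recover plaintext (padding oracle attacks). Always prefer authenticated encryption modes. If you must use a non-authenticating mode, add an HMAC over the ciphertext (encrypt-then-MAC), use a key separate from the encryption key, and verify the tag before decrypting.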
For complete coverage of AEAD algorithms, nonce requirements, and failure modes, see the Authenticated Encryption and Integrity guide.
Encryption at Rest vs Encryption in Transit
Encryption at rest protects data stored on disk, in databases, or in object storage. Encryption in transit protects data moving across a network between systems or clients. Both are required in most production environments — they address different threat surfaces and neither substitutes for the other.
| Property | At Rest | In Transit |
|---|---|---|
| Threat addressed | Physical theft, unauthorized storage access | Network interception, man-in-the-middle |
| Common implementation | AES-256 (full-disk, database-level, file-level) | TLS 1.3 with AES-GCM or ChaCha20-Poly1305 |
| Key management | KMS, HSM, envelope encryption | TLS certificates, session key negotiation |
| Does not protect against | Application-layer breaches (data decrypted to be used) | Compromised endpoints; data at rest on servers |
How to Decide Which One You Need
Start with your requirement, not the technology. Apply the following logic:
- You need to keep data secret from unauthorized parties: use Encryption. AES-GCM for symmetric; RSA or ECDH for key exchange. Always use authenticated modes.
- You need to verify data has not been modified (integrity only): use Hashing. SHA-256 for checksums and file verification. bcrypt or Argon2 for passwords.
- You need integrity and proof of who produced the data: use HMAC. HMAC-SHA256 is standard. Both parties must share a secret key. See the Hashing and HMAC guide.
- You need confidentiality and integrity together: use Authenticated Encryption (AES-GCM or ChaCha20-Poly1305). See the Authenticated Encryption guide.
- You need to convert data format for transport or system compatibility: use Encoding. Base64 for binary-to-text; URL encoding for query strings. This provides no security.
When Should You Use Each?
Concrete scenarios with the correct approach for each:
| Scenario | Correct approach |
|---|---|
| Storing user passwords in a database | bcrypt / Argon2 (hashing) |
| Encrypting a file for storage in S3 | AES-256-GCM (authenticated encryption) |
| Sending a binary image in a JSON API response | Base64 (encoding) |
| Verifying a webhook payload came from Stripe | HMAC-SHA256 (authentication) |
| Protecting network traffic between client and server | TLS 1.3 (encryption in transit) |
| Verifying a downloaded file is unmodified | SHA-256 checksum (hashing) |
| Embedding special characters in a URL parameter | URL encoding (percent-encoding) |
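For the downloaded-file scenario in the table, a checksum check might look like the sketch below. It reads the file in chunks so large downloads do not need to fit in memory. The function names and chunk size are illustrative, and the published checksum is assumed to come from a source you already trust.

```python
import hashlib
import hmac
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Compute the SHA-256 of a file by streaming it in fixed-size chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare the file's digest with a published hex checksum."""
    normalized = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), normalized)
```

Calling `matches_published_checksum(Path("download.bin"), "<hex from the vendor's site>")` returns `True` only if every byte of the file matches what the publisher hashed; the file name here is a placeholder.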
Related Resources
Encryption Guides
- Hashing and HMAC: Integrity, authentication, and when to use each
- Authenticated Encryption and Integrity: AES-GCM, ChaCha20-Poly1305, AEAD deep dive
- How Password Hashing Works: bcrypt, Argon2, PBKDF2; salting, work factors, and secure password storage
- What Is TLS?: Transport layer security, handshake, certificates, and common mistakes
- Encryption at Rest vs In Transit: Threat models, technologies, key management, and real-world examples
- What Is a Digital Signature?: How signing with a private key proves authorship and integrity
- 10 Common Encryption Mistakes: Passwords, hardcoded keys, missing HMAC, and other errors to avoid