Module F-2·18 min read

Why Redis Strings are not strings — they are binary-safe byte arrays. Integer encoding, atomic INCR/DECR, the GET/SET command family, MGET/MSET for batching, and why storing JSON blobs is the first mistake engineers make.

F-2 — Strings, Numbers, and Binary Safety

Q: A developer is storing millions of user IDs in Redis as strings (e.g., `SET user_id "1004593"`). Upon inspecting the memory usage, they realize the keys are taking up far less space than expected. When they run `OBJECT ENCODING user_id`, what will Redis return, and why?

`"int"`, because Redis automatically detects that the string value can be parsed as a 64-bit signed integer and stores it natively as an 8-byte integer to save memory. — Redis intelligently optimizes internal storage. Even if you pass the value wrapped in quotes as a string, if Redis determines the value fits perfectly within a 64-bit signed integer range, it will automatically encode the value using the `"int"` encoding. This uses exactly 8 bytes of memory, omitting the string header overhead. This is why numeric IDs or counters are incredibly memory-efficient in Redis compared to short text strings (which would use `"embstr"` encoding).

Q: An application uses Redis for distributed locking to prevent concurrent cron jobs from running. The original code uses `SETNX lock_key "job_server_1"`, followed by `EXPIRE lock_key 30`. Why is this pattern dangerous, and what is the modern solution?

There is a race condition. If the application crashes between the `SETNX` and `EXPIRE` commands, the lock will be held forever (no TTL), halting the system. The solution is to use the atomic unified command: `SET lock_key "job_server_1" NX EX 30`. — The two-command sequence `SETNX` + `EXPIRE` is fundamentally flawed in distributed systems because it is not atomic. If the network drops or the server crashes exactly between those two commands, the key exists indefinitely without an expiration. This creates a permanent deadlock. Since Redis 2.6.12, the `SET` command was updated to include conditional options (`NX` or `XX`) and expiration options (`EX` or `PX`) all within a single, atomic command. The old `SETNX` command is now considered a legacy anti-pattern for locking.

Q: A team is building a high-traffic e-commerce cart. They currently store the entire cart state as a JSON string: `SET cart:1001 '{"items": 5, "total": 100.50, "status": "active"}'`. When a user adds an item, they run `GET cart:1001`, parse the JSON, increment the items, stringify it, and call `SET cart:1001 new_json`. What are the two primary production risks with this architecture?

It introduces a race condition where concurrent requests overwrite each other (lost updates), and reading/writing large JSON blobs to update a single integer wastes CPU and network bandwidth. — The "fetch, modify, save" pattern with JSON strings is a classic anti-pattern. First, if two web requests execute the `GET` command simultaneously, they will both increment the same base value and then execute `SET`, resulting in a lost update (the cart total will be wrong). Second, you are paying the serialization, network transfer, and memory overhead of moving the entire JSON blob just to update one small value inside it. For structured data needing partial updates or atomic numeric increments, Redis Hashes (`HINCRBY`) are vastly superior.

Who this module is for: You have used SET and GET and assumed you understood Redis Strings. This module will show you what is actually happening under the hood — how Redis encodes integers differently from text, why "String" is a misleading name for what is really a binary-safe byte array, and why the seemingly obvious pattern of SET key (JSON.stringify(obj)) creates problems you will only discover under load.

What a Redis String Actually Is

The name "String" is the first thing Redis gets slightly wrong. When most engineers hear "string," they think of text — a sequence of Unicode characters with an encoding like UTF-8.

A Redis String is not that.

A Redis String is a binary-safe byte array — a sequence of raw bytes with no imposed encoding, no null-termination requirement, and no character set assumption. It can hold:

Plain text: "Hello, Redis"
A serialized JSON object: "{\"id\":1,\"name\":\"Jatin\"}"
A serialized Protocol Buffer
A JPEG image
A packed binary struct
An integer: "42"
An empty string: ""

The maximum size is 512 MB per key. In practice, values larger than a few kilobytes start to create problems (large values block the event loop during serialization, consume significant memory, and become expensive to transfer over the network), but the hard limit is 512 MB.

The "binary-safe" property matters because many key-value systems from Redis's era (early 2000s Memcached, for example) used C-style null-terminated strings, which meant you could not store arbitrary binary data — a null byte would terminate the string early. Redis stores the length alongside the data, so a null byte in the middle of a value is perfectly valid.

How Redis Encodes Strings Internally

Here is something most Redis users never learn: Redis does not store all String values the same way internally. It uses three different encodings depending on the value:

`int` encoding

If the value is an integer that fits in a 64-bit signed long (roughly -9.2 × 10¹⁸ to 9.2 × 10¹⁸), Redis stores the actual integer, not a string representation of it.

text

127.0.0.1:6379> SET counter 42
OK
127.0.0.1:6379> OBJECT ENCODING counter
"int"

This is significant for two reasons:

Memory efficiency. An integer stored as int encoding takes 8 bytes. The same number stored as a string "42" would take 2 bytes of data plus string header overhead. For large integers like Unix timestamps or IDs, the difference is meaningful at scale.
Atomic arithmetic. INCR, DECR, INCRBY, DECRBY only work on keys with int encoding (or keys whose value is a string representation of an integer). Redis can parse and operate on them atomically.

Additionally, Redis maintains a shared integer pool for the integers 0 through 9999. When you store any of these values, Redis does not allocate new memory — it points to a pre-allocated shared object. This is why OBJECT REFCOUNT on small integers returns a large number.

text

127.0.0.1:6379> SET x 100
OK
127.0.0.1:6379> OBJECT REFCOUNT x
(integer) 2147483647   # shared object, refcount saturated

`embstr` encoding

For strings up to 44 bytes, Redis uses embstr (embedded string) encoding. The string header and the data are allocated in a single contiguous memory block, making it cache-friendly and reducing allocator overhead.

text

127.0.0.1:6379> SET name "Jatin Jain Saraf"
OK
127.0.0.1:6379> OBJECT ENCODING name
"embstr"

embstr objects are immutable — any modification (like APPEND) causes Redis to convert the encoding to raw and reallocate.

`raw` encoding

For strings longer than 44 bytes, Redis uses raw encoding: a standard dynamic string (SDS — Simple Dynamic String) where the header and data are in separate memory allocations.

text

127.0.0.1:6379> SET longkey "This string is definitely longer than forty-four bytes total"
OK
127.0.0.1:6379> OBJECT ENCODING longkey
"raw"

Why does this matter? Because encoding determines memory usage and performance characteristics. If you are storing millions of short strings (session tokens, feature flags, user IDs), embstr gives you better cache locality. If you are storing large JSON blobs, raw encoding is unavoidable — and that is where the "storing JSON in a String" pattern starts to cost you.

The SET Command in Full

Most engineers know SET key value. The full signature is considerably richer:

SET key value [NX | XX] [GET] [EX seconds | PX milliseconds | EXAT unix-time-seconds | PXAT unix-time-milliseconds | KEEPTTL]

Let us go through each option:

Expiry options

bash

# Expire in 3600 seconds (1 hour)
SET session:user:1001 "token_abc" EX 3600

# Expire in 3600000 milliseconds (1 hour, more precise)
SET session:user:1001 "token_abc" PX 3600000

# Expire at a specific Unix timestamp (seconds)
SET promo:summer2025 "active" EXAT 1751299200

# Expire at a specific Unix timestamp (milliseconds)
SET promo:summer2025 "active" PXAT 1751299200000

# Preserve the existing TTL (don't reset it on update)
SET session:user:1001 "new_token" KEEPTTL

KEEPTTL is underused. It lets you update a value without accidentally removing the TTL. Without it, SET on a key that already has a TTL will reset the TTL to infinite (persistent) — a common source of session expiry bugs.

Conditional options

bash

# NX — only set if key does NOT exist
SET lock:resource:42 "owner_uuid" NX EX 30
# Returns OK if the key was set, (nil) if it already existed

# XX — only set if key DOES exist
SET user:1001:name "Updated Name" XX
# Returns OK if the key existed and was updated, (nil) if it did not exist

NX is the foundation of distributed locking (covered in depth in A-5). SET key value NX EX seconds is the atomic "acquire a lock" primitive — it either sets the key and returns OK (lock acquired) or returns (nil) (lock already held).

The old pattern of SETNX + EXPIRE as two separate commands is broken — if the process crashes between the two commands, the key has no expiry and the lock is never released. SET ... NX EX is atomic. Always use it.

GET option (Redis 6.2+)

bash

# Set the new value and return the OLD value atomically
SET user:1001:status "offline" GET
# Returns the previous value of user:1001:status (e.g., "online")
# or (nil) if the key did not previously exist

This is equivalent to the old GETSET command (which is now deprecated), but integrated into SET itself.

The GET Command Family

Beyond plain GET, Redis provides several atomic read-and-modify commands:

GETDEL

bash

# Get the value and delete the key atomically
GETDEL session:user:1001
# Returns the value and removes the key in one operation

Useful for one-time tokens: generate a token, store it in Redis, consume it (get-and-delete) exactly once. No race condition between reading and deleting.

GETEX

bash

# Get the value and set/modify its expiry atomically
GETEX session:user:1001 EX 3600      # reset TTL to 1 hour on access
GETEX session:user:1001 PERSIST      # remove TTL (make persistent)
GETEX session:user:1001 EXAT 1751299200  # set absolute expiry

GETEX is the correct way to implement sliding session expiry: every time the user makes a request, read their session and extend its TTL by another hour. Without GETEX, you would need a GET followed by an EXPIRE — two commands, two round-trips, and a race condition window.

GETSET (deprecated, use SET ... GET)

bash

# Old pattern — still works, but SET ... GET is preferred
GETSET key newvalue

SETNX / SETEX (deprecated)

bash

# Old: SETNX key value  →  New: SET key value NX
# Old: SETEX key seconds value  →  New: SET key value EX seconds

These still work but are considered legacy. Use the unified SET with options.

Batch Operations: MSET and MGET

Every Redis command requires a network round-trip. If you need to set or get 10 keys, 10 individual SET/GET calls means 10 round-trips. At 1ms per round-trip, that is 10ms of pure network latency.

MSET and MGET solve this:

bash

# Set multiple keys in one command
MSET user:1001:name "Jatin" user:1001:email "jatin@example.com" user:1001:role "admin"
# OK

# Get multiple keys in one command
MGET user:1001:name user:1001:email user:1001:role
# 1) "Jatin"
# 2) "jatin@example.com"
# 3) "admin"

MSET is atomic in the sense that all keys are set together — no other command can see a partial write where some keys are set and others are not. However, it is not atomic in the transactional sense: if the server crashes mid-MSET, some keys may have been persisted and others not.

MGET returns values in the same order as the keys you requested. If a key does not exist, it returns nil for that position:

bash

MGET user:1001:name user:9999:name
# 1) "Jatin"
# 2) (nil)

When to use MGET vs a Hash: If you are storing multiple fields for the same entity (user profile, product metadata), a Hash (HSET/HMGET) is usually better than multiple String keys. A Hash stores all fields under one key, keeping the entity together, supporting partial reads, and using more memory-efficient encoding for small field counts. The String multi-key pattern makes more sense when the fields are accessed independently rather than together.

MSETNX

bash

# Set multiple keys only if NONE of them exist
MSETNX key1 val1 key2 val2 key3 val3
# Returns 1 if all keys were set, 0 if any key already existed (none are set)

MSETNX is all-or-nothing: either every key is set, or none are. Useful for initializing a group of related keys atomically.

Atomic Integer Operations

Redis's integer commands are one of its most useful features and most underused outside of simple counters.

INCR and DECR

bash

SET page:views:homepage 0

INCR page:views:homepage   # → 1
INCR page:views:homepage   # → 2
INCR page:views:homepage   # → 3

DECR page:views:homepage   # → 2

INCR atomically increments the integer value by 1 and returns the new value. DECR decrements by 1. They fail with an error if the value is not an integer or exceeds the 64-bit signed long range.

The atomicity guarantee is important: if 1,000 concurrent clients all call INCR on the same key simultaneously, Redis processes each one serially (single-threaded). The final value will be exactly 1,000 — no lost updates, no race conditions, no need for locks. This is impossible to replicate safely with GET + application-side increment + SET.

bash

# This is WRONG — race condition between GET and SET
val = redis.get('counter')
redis.set('counter', int(val) + 1)   # Two clients can both read 5, both write 6

# This is CORRECT — atomic
redis.incr('counter')

INCRBY and DECRBY

bash

INCRBY page:views:homepage 10   # add 10
DECRBY stock:item:42 3          # subtract 3

INCRBYFLOAT

bash

SET product:price:42 19.99
INCRBYFLOAT product:price:42 0.50   # → "20.49"
INCRBYFLOAT product:price:42 -2.00  # → "18.49"

INCRBYFLOAT works with floating-point values. It stores the result as a String (the encoding is always embstr or raw after a float operation, never int). The precision is limited to 17 significant digits. For financial data, use integers (store prices in cents) rather than floats.

The Counter Pattern in Production

Counters are the simplest Redis use case but appear everywhere:

bash

# Rate limiting: how many API calls has this user made in this window?
INCR rate:api:user:1001:2025-05-29:14    # current hour bucket
EXPIRE rate:api:user:1001:2025-05-29:14 3600

# Page view tracking
INCR stats:page:homepage:views

# Inventory management
DECRBY stock:product:42 1

# Unique ID generation
INCR global:order:id   # always returns a unique, monotonically increasing ID

The ID generation pattern (INCR global:order:id) is worth highlighting: a single Redis INCR call gives you a unique, monotonically increasing integer ID without a database sequence or UUID generation. It is simple, fast, and works at high throughput — with the caveat that it is not durable without persistence configured.

APPEND: Building Strings Incrementally

bash

SET log:session:1001 "2025-05-29T14:00:00 login"
APPEND log:session:1001 "\n2025-05-29T14:05:00 page_view"
APPEND log:session:1001 "\n2025-05-29T14:10:00 logout"

GET log:session:1001
# "2025-05-29T14:00:00 login\n2025-05-29T14:05:00 page_view\n2025-05-29T14:10:00 logout"

APPEND adds bytes to the end of an existing String value and returns the new length. If the key does not exist, it creates it first.

The use case is narrow: building up a string incrementally without reading it back each time. For log-like data at real scale, Redis Streams (F-7) are more appropriate because they support consumer groups and persistent delivery. But for simple, short-lived aggregations, APPEND works cleanly.

STRLEN: Getting the Length

bash

SET name "Jatin Jain Saraf"
STRLEN name   # → 16

STRLEN returns the length of the string value in bytes, not in characters. For ASCII text this is the same. For multi-byte UTF-8 characters, a single character may be 2–4 bytes, so STRLEN can return a larger number than the character count.

SUBSTR and GETRANGE: Reading a Substring

bash

SET greeting "Hello, Redis World"
GETRANGE greeting 0 4    # → "Hello"
GETRANGE greeting 7 11   # → "Redis"
GETRANGE greeting -5 -1  # → "World"  (negative index from end)

GETRANGE (historically also called SUBSTR) returns a substring without reading the entire value. Useful for reading a specific field from a fixed-format binary record stored as a Redis String, or for paging through large stored text.

The JSON Blob Anti-Pattern

Here is the pattern you will see in almost every Redis tutorial:

javascript

// "Cache" a user object
const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
await redis.set(`user:${userId}`, JSON.stringify(user), 'EX', 3600);

// Retrieve it
const cached = await redis.get(`user:${userId}`);
return JSON.parse(cached);

This works. It is also the source of several production problems that only appear at scale:

Problem 1: You cannot update a single field.

If the user changes their email, you must fetch the entire blob, deserialize it, update the field, re-serialize it, and write it back. Meanwhile another request might be doing the same thing. The update is not atomic — two concurrent updates can clobber each other.

Problem 2: You read the entire object even when you need one field.

If a middleware only needs user.role for authorization, you still deserialize a 2KB JSON blob to read a 10-byte string.

Problem 3: Serialization/deserialization CPU cost.

At 50,000 requests per second, deserializing 2KB JSON objects adds up. It is not Redis that is slow — it is your application spending CPU cycles on JSON parsing.

Problem 4: Large values block the event loop.

Redis's single-threaded architecture means reading or writing a very large String blocks all other commands. A 512KB JSON blob is a problem. A 50KB blob in a high-QPS system is worth thinking about.

The alternative: Use a Hash (HSET/HGET/HMGET) for structured objects. Redis Hashes store field-value pairs under a single key, support partial reads, and allow atomic field updates. Covered fully in F-3.

That said, JSON strings are not always wrong. They are appropriate when:

The entire object is always read together
The object is small (< 1–2 KB)
You need to store arbitrary, dynamic structure that does not fit a fixed field schema
You are using a language/library where Hash operations are awkward

The point is not to avoid JSON strings — it is to understand the trade-off and make the choice deliberately.

OBJECT ENCODING: Inspecting What Redis Actually Stores

As you work with Strings, use OBJECT ENCODING to verify what Redis is actually doing:

bash

SET a 42
OBJECT ENCODING a           # "int"

SET b "short text"
OBJECT ENCODING b           # "embstr"

SET c "this string is longer than forty-four bytes total yes"
OBJECT ENCODING c           # "raw"

APPEND b " more text"       # modifying embstr converts it
OBJECT ENCODING b           # "raw"

SET d 3.14
OBJECT ENCODING d           # "embstr" (floats are stored as strings)

This matters when diagnosing memory usage. A key you expect to use int encoding but accidentally set as a float will use embstr instead, consuming more memory per key when you have millions of them.

Key Size Matters

Redis keys are themselves stored as Strings. Their size contributes to memory usage. Compare:

bash

# 47 bytes of key name
SET user:profile:metadata:full:content:v2:1001 "value"

# 12 bytes of key name
SET u:p:1001 "value"

At 10 million keys, the difference between a 47-byte key and a 12-byte key is 350 MB of RAM — just for key names.

Guidelines:

Keep keys under 20–30 bytes where possible
Use the : separator convention but abbreviate entity names for high-volume keys
For very high-volume keys (billions), consider numeric keys over string keys

The key name is not the place to be verbose. Your code comments are.

OBJECT FREQ and OBJECT IDLETIME

Two additional inspection commands round out what you can discover about a String key:

bash

OBJECT IDLETIME mykey
# (integer) 42   — seconds since this key was last accessed
# Used by LRU eviction to decide which keys to evict

OBJECT FREQ mykey
# (integer) 5    — access frequency counter (only meaningful with LFU eviction policy)

These are diagnostic tools you will use when investigating which keys Redis is choosing to evict (covered in F-8).

Summary

A Redis String is a binary-safe byte array, not a text string. It can hold arbitrary bytes up to 512 MB.
Redis uses three internal encodings: int for integers (most efficient), embstr for short strings ≤ 44 bytes (cache-friendly), raw for longer strings.
The full SET command supports expiry (EX, PX, EXAT, KEEPTTL), conditional writes (NX, XX), and atomic get-and-set (GET).
GETDEL and GETEX provide atomic read-and-modify without a round-trip.
MSET/MGET batch multiple operations into one round-trip — use them whenever you need multiple keys.
INCR/INCRBY/INCRBYFLOAT are atomic counter operations with no race conditions.
The JSON blob pattern works but has real costs at scale: no partial reads, no atomic field updates, serialization CPU, and event loop blocking for large values. Use Hashes for structured objects.
OBJECT ENCODING tells you what Redis is actually storing — use it when diagnosing memory usage.

Next: F-3 — Lists, Hashes, Sets, and Sorted Sets — where we cover the data structures that make Redis genuinely different from every other cache, including the internal encoding switches that determine your memory footprint at scale.

Knowledge Check

A developer is storing millions of user IDs in Redis as strings (e.g., SET user_id "1004593"). Upon inspecting the memory usage, they realize the keys are taking up far less space than expected. When they run OBJECT ENCODING user_id, what will Redis return, and why?

An application uses Redis for distributed locking to prevent concurrent cron jobs from running. The original code uses SETNX lock_key "job_server_1", followed by EXPIRE lock_key 30. Why is this pattern dangerous, and what is the modern solution?

A team is building a high-traffic e-commerce cart. They currently store the entire cart state as a JSON string: SET cart:1001 '{"items": 5, "total": 100.50, "status": "active"}'. When a user adds an item, they run GET cart:1001, parse the JSON, increment the items, stringify it, and call SET cart:1001 new_json. What are the two primary production risks with this architecture?

Test your knowledge with more question sets

PreviousModule F-1: What Is Redis and Why Does It Exist?Next Module F-3: Lists, Hashes, Sets, and Sorted Sets — Internal Encodings

Discussion

Join the discussion

Loading comments...

F-2 — Strings, Numbers, and Binary Safety

What a Redis String Actually Is

How Redis Encodes Strings Internally

int encoding

embstr encoding

raw encoding

The SET Command in Full

Expiry options

Conditional options

GET option (Redis 6.2+)

The GET Command Family

GETDEL

GETEX

GETSET (deprecated, use SET ... GET)

SETNX / SETEX (deprecated)

Batch Operations: MSET and MGET

MSETNX

Atomic Integer Operations

INCR and DECR

INCRBY and DECRBY

INCRBYFLOAT

The Counter Pattern in Production

APPEND: Building Strings Incrementally

STRLEN: Getting the Length

SUBSTR and GETRANGE: Reading a Substring

The JSON Blob Anti-Pattern

OBJECT ENCODING: Inspecting What Redis Actually Stores

Key Size Matters

OBJECT FREQ and OBJECT IDLETIME

Summary

Test your knowledge with more question sets

Discussion

`int` encoding

`embstr` encoding

`raw` encoding