Why Redis Strings are not strings — they are binary-safe byte arrays. Integer encoding, atomic INCR/DECR, the GET/SET command family, MGET/MSET for batching, and why storing JSON blobs is the first mistake engineers make.
F-2 — Strings, Numbers, and Binary Safety
Who this module is for: You have used SET and GET and assumed you understood Redis Strings. This module will show you what is actually happening under the hood — how Redis encodes integers differently from text, why "String" is a misleading name for what is really a binary-safe byte array, and why the seemingly obvious pattern of SET key (JSON.stringify(obj)) creates problems you will only discover under load.
What a Redis String Actually Is
The name "String" is the first thing Redis gets slightly wrong. When most engineers hear "string," they think of text — a sequence of Unicode characters with an encoding like UTF-8.
A Redis String is not that.
A Redis String is a binary-safe byte array — a sequence of raw bytes with no imposed encoding, no null-termination requirement, and no character set assumption. It can hold:
- Plain text: "Hello, Redis"
- A serialized JSON object: "{\"id\":1,\"name\":\"Jatin\"}"
- A serialized Protocol Buffer
- A JPEG image
- A packed binary struct
- An integer: "42"
- An empty string: ""
The maximum size is 512 MB per key. In practice, values larger than a few kilobytes start to create problems (large values block the event loop during serialization, consume significant memory, and become expensive to transfer over the network), but the hard limit is 512 MB.
The "binary-safe" property matters because systems built on C-style null-terminated strings cannot store arbitrary binary data: a null byte terminates the string early. Redis stores the length alongside the data, so a null byte in the middle of a value is perfectly valid.
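To make the null-byte point concrete, here is a minimal sketch that packs a fixed-format binary record with Python's standard `struct` module. The record layout and the function names are hypothetical, chosen only for illustration. The resulting `bytes` value is what you would hand to a client's SET call unchanged.

```python
import struct

# Hypothetical record layout: user id (uint32), flags (uint16), score (float64).
RECORD_FORMAT = "<IHd"

def pack_record(user_id: int, flags: int, score: float) -> bytes:
    """Pack the fields into a fixed-size little-endian byte string."""
    return struct.pack(RECORD_FORMAT, user_id, flags, score)

def unpack_record(payload: bytes) -> tuple:
    """Inverse of pack_record: recover (user_id, flags, score)."""
    return struct.unpack(RECORD_FORMAT, payload)

payload = pack_record(1, 0, 2.5)

# The packed value contains null bytes. A null-terminated string would be
# cut off at the first one; a length-prefixed value keeps every byte.
print(len(payload))
print(b"\x00" in payload)
print(unpack_record(payload))
```

A store that treated this payload as a C string would keep only the first byte, because the second byte is already a null.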
How Redis Encodes Strings Internally
Here is something most Redis users never learn: Redis does not store all String values the same way internally. It uses three different encodings depending on the value:
int encoding
If the value is an integer that fits in a 64-bit signed long (roughly -9.2 × 10¹⁸ to 9.2 × 10¹⁸), Redis stores the actual integer, not a string representation of it.
127.0.0.1:6379> SET counter 42
OK
127.0.0.1:6379> OBJECT ENCODING counter
"int"
This is significant for two reasons:
- Memory efficiency. An integer stored with int encoding takes 8 bytes. The same number stored as the string "42" would take 2 bytes of data plus string header overhead. For large integers like Unix timestamps or IDs, the difference is meaningful at scale.
- Atomic arithmetic. INCR, DECR, INCRBY, and DECRBY only work on keys with int encoding (or keys whose value is a string representation of an integer). Redis can parse and operate on them atomically.
Additionally, Redis maintains a shared integer pool for the integers 0 through 9999. When you store any of these values, Redis does not allocate new memory — it points to a pre-allocated shared object. This is why OBJECT REFCOUNT on small integers returns a large number.
127.0.0.1:6379> SET x 100
OK
127.0.0.1:6379> OBJECT REFCOUNT x
(integer) 2147483647 # shared object, refcount saturated
embstr encoding
For strings up to 44 bytes, Redis uses embstr (embedded string) encoding. The string header and the data are allocated in a single contiguous memory block, making it cache-friendly and reducing allocator overhead.
127.0.0.1:6379> SET name "Jatin Jain Saraf"
OK
127.0.0.1:6379> OBJECT ENCODING name
"embstr"
embstr objects are immutable — any modification (like APPEND) causes Redis to convert the encoding to raw and reallocate.
raw encoding
For strings longer than 44 bytes, Redis uses raw encoding: a standard dynamic string (SDS — Simple Dynamic String) where the header and data are in separate memory allocations.
127.0.0.1:6379> SET longkey "This string is definitely longer than forty-four bytes total"
OK
127.0.0.1:6379> OBJECT ENCODING longkey
"raw"
Why does this matter? Because encoding determines memory usage and performance characteristics. If you are storing millions of short strings (session tokens, feature flags, user IDs), embstr gives you better cache locality. If you are storing large JSON blobs, raw encoding is unavoidable — and that is where the "storing JSON in a String" pattern starts to cost you.
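The three rules can be written down as a small classifier. This is a simplified model of the behavior described in this section, not Redis's actual implementation: the function name is hypothetical, and the rule that only canonical integers (no leading zeros, no plus sign) qualify for int encoding is an assumption about how Redis parses values. Treat OBJECT ENCODING on a real server as the authority.

```python
import re

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
EMBSTR_MAX_BYTES = 44  # threshold quoted in this section

# Canonical decimal integers only: no leading zeros, no "+", no "-0".
_CANONICAL_INT = re.compile(rb"(0|-?[1-9][0-9]*)")

def predict_encoding(value: bytes) -> str:
    """Guess what OBJECT ENCODING would report for a value written with SET."""
    looks_like_int = len(value) <= 20 and _CANONICAL_INT.fullmatch(value)
    if looks_like_int and INT64_MIN <= int(value) <= INT64_MAX:
        return "int"
    if len(value) <= EMBSTR_MAX_BYTES:
        return "embstr"
    return "raw"

for sample in [b"42", b"3.14", b"007", b"short text", b"x" * 45]:
    print(sample[:12], predict_encoding(sample))
```

The model only covers values written with SET. A value that has been modified afterwards (with APPEND, for example) can report raw even when it is short.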
The SET Command in Full
Most engineers know SET key value. The full signature is considerably richer:
SET key value [NX | XX] [GET] [EX seconds | PX milliseconds | EXAT unix-time-seconds | PXAT unix-time-milliseconds | KEEPTTL]
Let us go through each option:
Expiry options
# Expire in 3600 seconds (1 hour)
SET session:user:1001 "token_abc" EX 3600
# Expire in 3600000 milliseconds (1 hour, more precise)
SET session:user:1001 "token_abc" PX 3600000
# Expire at a specific Unix timestamp (seconds)
SET promo:summer2025 "active" EXAT 1751299200
# Expire at a specific Unix timestamp (milliseconds)
SET promo:summer2025 "active" PXAT 1751299200000
# Preserve the existing TTL (don't reset it on update)
SET session:user:1001 "new_token" KEEPTTL
KEEPTTL is underused. It lets you update a value without accidentally removing the TTL. Without it, SET on a key that already has a TTL will reset the TTL to infinite (persistent) — a common source of session expiry bugs.
Conditional options
# NX: only set if key does NOT exist
SET lock:resource:42 "owner_uuid" NX EX 30
# Returns OK if the key was set, (nil) if it already existed
# XX: only set if key DOES exist
SET user:1001:name "Updated Name" XX
# Returns OK if the key existed and was updated, (nil) if it did not exist
NX is the foundation of distributed locking (covered in depth in A-5). SET key value NX EX seconds is the atomic "acquire a lock" primitive — it either sets the key and returns OK (lock acquired) or returns (nil) (lock already held).
The old pattern of SETNX + EXPIRE as two separate commands is broken — if the process crashes between the two commands, the key has no expiry and the lock is never released. SET ... NX EX is atomic. Always use it.
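From application code, the acquire step might look like the sketch below. It assumes `client` is a redis-py `redis.Redis` instance (or anything with a compatible `set` method); the function name and the default TTL are illustrative, and releasing the lock safely is a separate problem left to A-5.

```python
import uuid

def acquire_lock(client, resource: str, ttl_seconds: int = 30):
    """Try to take the lock once. Return the owner token, or None if it is held."""
    token = str(uuid.uuid4())
    # Equivalent to: SET lock:resource:<resource> <token> NX EX <ttl_seconds>
    acquired = client.set(
        f"lock:resource:{resource}", token, nx=True, ex=ttl_seconds
    )
    # redis-py returns a truthy value when the key was set and None when
    # NX prevented the write.
    return token if acquired else None
```

Storing a random token rather than a fixed value lets the owner later prove the lock is still its own before releasing it.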
GET option (Redis 6.2+)
# Set the new value and return the OLD value atomically
SET user:1001:status "offline" GET
# Returns the previous value of user:1001:status (e.g., "online")
# or (nil) if the key did not previously exist
This is equivalent to the old GETSET command (which is now deprecated), but integrated into SET itself.
The GET Command Family
Beyond plain GET, Redis provides several atomic read-and-modify commands:
GETDEL
# Get the value and delete the key atomically
GETDEL session:user:1001
# Returns the value and removes the key in one operation
Useful for one-time tokens: generate a token, store it in Redis, consume it (get-and-delete) exactly once. No race condition between reading and deleting.
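A minimal sketch of that one-time token flow, assuming `client` is a redis-py `redis.Redis` instance. The `token:` key prefix, the function names, and the 15-minute default TTL are hypothetical choices for illustration.

```python
import secrets

def issue_token(client, user_id: str, ttl_seconds: int = 900) -> str:
    """Create a random token that maps to user_id and expires on its own."""
    token = secrets.token_urlsafe(32)
    client.set(f"token:{token}", user_id, ex=ttl_seconds)
    return token

def consume_token(client, token: str):
    """Return the stored user id exactly once; None if used, expired, or unknown."""
    # GETDEL reads and deletes in a single command, so two concurrent
    # consumers cannot both receive the value.
    return client.getdel(f"token:{token}")
```

With a default redis-py client the returned value is `bytes`; constructing the client with `decode_responses=True` returns `str` instead.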
GETEX
# Get the value and set/modify its expiry atomically
GETEX session:user:1001 EX 3600          # reset TTL to 1 hour on access
GETEX session:user:1001 PERSIST          # remove TTL (make persistent)
GETEX session:user:1001 EXAT 1751299200  # set absolute expiry
GETEX is the correct way to implement sliding session expiry: every time the user makes a request, read their session and extend its TTL by another hour. Without GETEX, you would need a GET followed by an EXPIRE — two commands, two round-trips, and a race condition window.
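In application code the sliding-expiry read collapses to a single call. The sketch assumes `client` is a redis-py `redis.Redis` instance talking to Redis 6.2 or later; the function name and the constant are illustrative.

```python
SESSION_TTL_SECONDS = 3600

def load_session(client, user_id: int):
    """Read the session and push its expiry out by one hour in one command."""
    # Equivalent to: GETEX session:user:<user_id> EX 3600
    return client.getex(f"session:user:{user_id}", ex=SESSION_TTL_SECONDS)
```

A `None` result means the session has already expired or never existed, so the caller should treat the user as logged out.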
GETSET (deprecated, use SET ... GET)
# Old pattern: still works, but SET ... GET is preferred
GETSET key newvalue
SETNX / SETEX (deprecated)
# Old: SETNX key value          → New: SET key value NX
# Old: SETEX key seconds value  → New: SET key value EX seconds
These still work but are considered legacy. Use the unified SET with options.
Batch Operations: MSET and MGET
Sent one at a time, every Redis command costs a network round-trip. If you need to set or get 10 keys, 10 individual SET/GET calls mean 10 round-trips. At 1 ms per round-trip, that is 10 ms of pure network latency.
MSET and MGET solve this:
# Set multiple keys in one command
MSET user:1001:name "Jatin" user:1001:email "jatin@example.com" user:1001:role "admin"
# OK
# Get multiple keys in one command
MGET user:1001:name user:1001:email user:1001:role
# 1) "Jatin"
# 2) "jatin@example.com"
# 3) "admin"
MSET is atomic: all keys are set together, and no other client can observe a partial write where some keys are set and others are not. Atomicity is not durability, though. Like any other write, an MSET that has completed can still be lost if the server crashes before the change reaches disk.
MGET returns values in the same order as the keys you requested. If a key does not exist, it returns nil for that position:
MGET user:1001:name user:9999:name
# 1) "Jatin"
# 2) (nil)
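Because the reply order matches the request order, pairing keys with values is a one-liner. The sketch assumes `client` is a redis-py `redis.Redis` instance; the helper name and the `user:<id>:<field>` key layout follow the examples in this section.

```python
def fetch_user_fields(client, user_id: int, fields: list) -> dict:
    """Fetch several per-field keys for one user in a single round-trip."""
    keys = [f"user:{user_id}:{field}" for field in fields]
    values = client.mget(keys)  # missing keys come back as None
    return dict(zip(fields, values))
```

Callers should check for `None` per field rather than assuming every key exists.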
When to use MGET vs a Hash: If you are storing multiple fields for the same entity (user profile, product metadata), a Hash (HSET/HMGET) is usually better than multiple String keys. A Hash stores all fields under one key, keeping the entity together, supporting partial reads, and using more memory-efficient encoding for small field counts. The String multi-key pattern makes more sense when the fields are accessed independently rather than together.
MSETNX
# Set multiple keys only if NONE of them exist
MSETNX key1 val1 key2 val2 key3 val3
# Returns 1 if all keys were set, 0 if any key already existed (none are set)
MSETNX is all-or-nothing: either every key is set, or none are. Useful for initializing a group of related keys atomically.
Atomic Integer Operations
Redis's integer commands are one of its most useful features and most underused outside of simple counters.
INCR and DECR
SET page:views:homepage 0
INCR page:views:homepage   # → 1
INCR page:views:homepage   # → 2
INCR page:views:homepage   # → 3
DECR page:views:homepage   # → 2
INCR atomically increments the integer value by 1 and returns the new value. DECR decrements by 1. They fail with an error if the value is not an integer or if the result would fall outside the 64-bit signed range.
The atomicity guarantee is important: if 1,000 concurrent clients all call INCR on the same key simultaneously, Redis processes each one serially (single-threaded). The final value will be exactly 1,000: no lost updates, no race conditions, no need for locks. A plain GET + application-side increment + SET cannot give you this guarantee.
# This is WRONG: race condition between GET and SET
val = redis.get('counter')
redis.set('counter', int(val) + 1)
# Two clients can both read 5, both write 6

# This is CORRECT: atomic
redis.incr('counter')
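The lost update is easier to see when the interleaving is written out step by step. In this sketch a plain dictionary stands in for Redis and the order of operations is scripted by hand, so it is a deterministic illustration of the race rather than a reproduction of it.

```python
store = {"counter": 5}

# Non-atomic read-modify-write, with the two clients' steps interleaved.
a_read = store["counter"]       # client A reads 5
b_read = store["counter"]       # client B reads 5
store["counter"] = a_read + 1   # client A writes 6
store["counter"] = b_read + 1   # client B writes 6, overwriting A's update
lost_update_result = store["counter"]

def incr(data: dict, key: str) -> int:
    """Read and write as one indivisible step, the way INCR behaves."""
    data[key] = data.get(key, 0) + 1
    return data[key]

store["counter"] = 5
incr(store, "counter")  # client A
incr(store, "counter")  # client B
atomic_result = store["counter"]

print(lost_update_result, atomic_result)  # 6 7
```

Two increments were requested in both cases, but only the atomic version ends at 7.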
INCRBY and DECRBY
INCRBY page:views:homepage 10   # add 10
DECRBY stock:item:42 3          # subtract 3
INCRBYFLOAT
SET product:price:42 19.99
INCRBYFLOAT product:price:42 0.50    # → "20.49"
INCRBYFLOAT product:price:42 -2.00   # → "18.49"
INCRBYFLOAT works with floating-point values. It stores the result as a String (the encoding is always embstr or raw after a float operation, never int). The precision is limited to 17 significant digits. For financial data, use integers (store prices in cents) rather than floats.
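One way to follow the "store cents" advice is to convert at the application boundary and keep only integers in Redis. The helper names below are hypothetical, and the sketch assumes a currency with two decimal places.

```python
from decimal import Decimal

def to_cents(price: str) -> int:
    """Convert a decimal price string such as "19.99" to integer cents."""
    cents = Decimal(price) * 100
    if cents != cents.to_integral_value():
        raise ValueError(f"{price!r} has more than two decimal places")
    return int(cents)

def format_cents(cents: int) -> str:
    """Render integer cents back as a decimal string for display."""
    sign = "-" if cents < 0 else ""
    whole, frac = divmod(abs(cents), 100)
    return f"{sign}{whole}.{frac:02d}"

price = to_cents("19.99")    # 1999, an integer that INCRBY can work on
price += to_cents("0.50")    # the same effect as INCRBY product:price:42 50
print(format_cents(price))   # 20.49
```

Parsing through `Decimal` rather than `float` avoids binary rounding on the way in, and the stored value keeps int encoding.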
The Counter Pattern in Production
Counters are the simplest Redis use case but appear everywhere:
# Rate limiting: how many API calls has this user made in this window?
INCR rate:api:user:1001:2025-05-29:14     # current hour bucket
EXPIRE rate:api:user:1001:2025-05-29:14 3600
# Page view tracking
INCR stats:page:homepage:views
# Inventory management
DECRBY stock:product:42 1
# Unique ID generation
INCR global:order:id   # always returns a unique, monotonically increasing ID
The ID generation pattern (INCR global:order:id) is worth highlighting: a single Redis INCR call gives you a unique, monotonically increasing integer ID without a database sequence or UUID generation. It is simple, fast, and works at high throughput — with the caveat that it is not durable without persistence configured.
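For the rate-limiting counter, sending INCR and EXPIRE together avoids leaving a bucket without a TTL if the process dies between the two commands. The sketch assumes `client` is a redis-py `redis.Redis` instance, whose pipelines are wrapped in MULTI/EXEC by default; the function name, its parameters, and the fixed-window policy are illustrative.

```python
def hit_rate_limit(client, user_id: int, bucket: str, limit: int,
                   window_seconds: int = 3600) -> bool:
    """Count one call in the bucket; return True if the user is over the limit."""
    key = f"rate:api:user:{user_id}:{bucket}"
    pipe = client.pipeline()
    pipe.incr(key)                     # count this call
    pipe.expire(key, window_seconds)   # make sure the bucket cleans itself up
    count, _ = pipe.execute()          # both commands travel in one round-trip
    return count > limit
```

The caller supplies the bucket label (for example `"2025-05-29:14"` for an hourly window), which keeps the function easy to test without mocking the clock.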
APPEND: Building Strings Incrementally
SET log:session:1001 "2025-05-29T14:00:00 login"
APPEND log:session:1001 "\n2025-05-29T14:05:00 page_view"
APPEND log:session:1001 "\n2025-05-29T14:10:00 logout"
GET log:session:1001
# "2025-05-29T14:00:00 login\n2025-05-29T14:05:00 page_view\n2025-05-29T14:10:00 logout"
APPEND adds bytes to the end of an existing String value and returns the new length. If the key does not exist, it creates it first.
The use case is narrow: building up a string incrementally without reading it back each time. For log-like data at real scale, Redis Streams (F-7) are more appropriate because they support consumer groups and persistent delivery. But for simple, short-lived aggregations, APPEND works cleanly.
STRLEN: Getting the Length
SET name "Jatin Jain Saraf"
STRLEN name   # → 16
STRLEN returns the length of the string value in bytes, not in characters. For ASCII text this is the same. For multi-byte UTF-8 characters, a single character may be 2–4 bytes, so STRLEN can return a larger number than the character count.
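The gap between characters and bytes is easy to check locally. The helper below is a hypothetical stand-in that computes what STRLEN would count for a value sent as UTF-8; it does not talk to Redis.

```python
def redis_strlen(text: str, encoding: str = "utf-8") -> int:
    """Byte length of text once encoded, which is what STRLEN would count."""
    return len(text.encode(encoding))

samples = ["Jatin", "caf\u00e9", "\u65e5\u672c\u8a9e", "\U0001F642"]
for sample in samples:
    # Columns: the text, its character count, its UTF-8 byte count.
    print(sample, len(sample), redis_strlen(sample))
```

Only the pure-ASCII sample has matching character and byte counts. Any length limit enforced with STRLEN is therefore a byte limit.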
SUBSTR and GETRANGE: Reading a Substring
SET greeting "Hello, Redis World"
GETRANGE greeting 0 4     # → "Hello"
GETRANGE greeting 7 11    # → "Redis"
GETRANGE greeting -5 -1   # → "World" (negative index from end)
GETRANGE (historically also called SUBSTR) returns a substring without transferring the entire value to the client. Both offsets are inclusive. Useful for reading a specific field from a fixed-format binary record stored as a Redis String, or for paging through large stored text.
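GETRANGE's inclusive offsets differ from Python's half-open slices, which is a common source of off-by-one errors. The function below is a small pure-Python model of the offset rules as the GETRANGE documentation describes them; it is an approximation for reasoning about indexes, not Redis's code.

```python
def getrange(value: bytes, start: int, end: int) -> bytes:
    """Model of GETRANGE: both offsets inclusive; negatives count from the end."""
    length = len(value)
    if start < 0 and end < 0 and start > end:
        return b""
    if start < 0:
        start = max(length + start, 0)
    if end < 0:
        end = max(length + end, 0)
    end = min(end, length - 1)   # an end past the value is clamped
    if length == 0 or start > end:
        return b""
    return value[start:end + 1]  # +1 converts inclusive end to a Python slice

greeting = b"Hello, Redis World"
print(getrange(greeting, 0, 4))    # b'Hello'
print(getrange(greeting, 7, 11))   # b'Redis'
print(getrange(greeting, -5, -1))  # b'World'
```

Out-of-range requests return an empty value rather than an error, so callers reading fixed-format records should validate lengths themselves.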
The JSON Blob Anti-Pattern
Here is the pattern you will see in almost every Redis tutorial:
// "Cache" a user object
const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
await redis.set(`user:${userId}`, JSON.stringify(user), 'EX', 3600);

// Retrieve it
const cached = await redis.get(`user:${userId}`);
return JSON.parse(cached);
This works. It is also the source of several production problems that only appear at scale:
Problem 1: You cannot update a single field.
If the user changes their email, you must fetch the entire blob, deserialize it, update the field, re-serialize it, and write it back. Meanwhile another request might be doing the same thing. The update is not atomic — two concurrent updates can clobber each other.
Problem 2: You read the entire object even when you need one field.
If a middleware only needs user.role for authorization, you still deserialize a 2KB JSON blob to read a 10-byte string.
Problem 3: Serialization/deserialization CPU cost.
At 50,000 requests per second, deserializing 2KB JSON objects adds up. It is not Redis that is slow — it is your application spending CPU cycles on JSON parsing.
Problem 4: Large values block the event loop.
Redis's single-threaded architecture means reading or writing a very large String blocks all other commands. A 512KB JSON blob is a problem. A 50KB blob in a high-QPS system is worth thinking about.
The alternative: Use a Hash (HSET/HGET/HMGET) for structured objects. Redis Hashes store field-value pairs under a single key, support partial reads, and allow atomic field updates. Covered fully in F-3.
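As a rough sketch of the Hash version, assuming `client` is a redis-py `redis.Redis` instance and that the user object is a flat dictionary of scalar values (nested structures would still need serializing). The function names are hypothetical.

```python
def cache_user(client, user: dict, ttl_seconds: int = 3600) -> None:
    """Store each field of the user as a Hash field instead of one JSON blob."""
    key = f"user:{user['id']}"
    client.hset(key, mapping={name: str(value) for name, value in user.items()})
    # Note: HSET and EXPIRE are two separate commands in this sketch.
    client.expire(key, ttl_seconds)

def get_user_role(client, user_id: int):
    """Read a single field without fetching or parsing the rest of the object."""
    return client.hget(f"user:{user_id}", "role")

def update_user_email(client, user_id: int, email: str) -> None:
    """Overwrite one field in place; the other fields are untouched."""
    client.hset(f"user:{user_id}", "email", email)
```

The email update no longer needs a read, a parse, and a full rewrite, which removes the window in which two writers could clobber each other's changes to different fields.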
That said, JSON strings are not always wrong. They are appropriate when:
- The entire object is always read together
- The object is small (< 1–2 KB)
- You need to store arbitrary, dynamic structure that does not fit a fixed field schema
- You are using a language/library where Hash operations are awkward
The point is not to avoid JSON strings — it is to understand the trade-off and make the choice deliberately.
OBJECT ENCODING: Inspecting What Redis Actually Stores
As you work with Strings, use OBJECT ENCODING to verify what Redis is actually doing:
SET a 42
OBJECT ENCODING a   # "int"
SET b "short text"
OBJECT ENCODING b   # "embstr"
SET c "this string is longer than forty-four bytes total yes"
OBJECT ENCODING c   # "raw"
APPEND b " more text"   # modifying embstr converts it
OBJECT ENCODING b   # "raw"
SET d 3.14
OBJECT ENCODING d   # "embstr" (floats are stored as strings)
This matters when diagnosing memory usage. A key you expect to use int encoding but accidentally set as a float will use embstr instead, consuming more memory per key when you have millions of them.
Key Size Matters
Redis keys are themselves stored as Strings. Their size contributes to memory usage. Compare:
# 42 bytes of key name
SET user:profile:metadata:full:content:v2:1001 "value"
# 8 bytes of key name
SET u:p:1001 "value"
At 10 million keys, the difference between a 42-byte key and an 8-byte key is roughly 340 MB of RAM, just for key names.
Guidelines:
- Keep keys under 20–30 bytes where possible
- Use the : separator convention but abbreviate entity names for high-volume keys
- For very high-volume keys (billions), consider numeric keys over string keys
The key name is not the place to be verbose. Your code comments are.
OBJECT FREQ and OBJECT IDLETIME
Two additional inspection commands round out what you can discover about a String key:
OBJECT IDLETIME mykey
# (integer) 42: seconds since this key was last accessed
# Used by LRU eviction to decide which keys to evict
OBJECT FREQ mykey
# (integer) 5: access frequency counter (only available with an LFU eviction policy; otherwise the command returns an error)
These are diagnostic tools you will use when investigating which keys Redis is choosing to evict (covered in F-8).
Summary
- A Redis String is a binary-safe byte array, not a text string. It can hold arbitrary bytes up to 512 MB.
- Redis uses three internal encodings: int for integers (most efficient), embstr for short strings ≤ 44 bytes (cache-friendly), raw for longer strings.
- The full SET command supports expiry (EX, PX, EXAT, PXAT, KEEPTTL), conditional writes (NX, XX), and atomic get-and-set (GET).
- GETDEL and GETEX provide atomic read-and-modify without an extra round-trip.
- MSET/MGET batch multiple operations into one round-trip. Use them whenever you need multiple keys.
- INCR/INCRBY/INCRBYFLOAT are atomic counter operations with no race conditions.
- The JSON blob pattern works but has real costs at scale: no partial reads, no atomic field updates, serialization CPU, and event loop blocking for large values. Use Hashes for structured objects.
- OBJECT ENCODING tells you what Redis is actually storing. Use it when diagnosing memory usage.
Next: F-3 — Lists, Hashes, Sets, and Sorted Sets — where we cover the data structures that make Redis genuinely different from every other cache, including the internal encoding switches that determine your memory footprint at scale.