Introduction

Memcached CAS (Check And Set) provides optimistic locking for concurrent updates. When two clients read the same key with `gets`, modify the value, and attempt to write it back with `cas`, one write succeeds and the other is rejected with an `EXISTS` response because its CAS token is stale. Unless the application checks that response and retries, the second update is silently lost.
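The race can be reproduced without a server. The sketch below uses `FakeCasStore`, a hypothetical in-memory stand-in that only mimics the `gets`/`cas` token semantics (it is not part of any Memcached client), to show two readers holding the same token and only the first write being accepted.

```python
import itertools


class FakeCasStore:
    """Hypothetical in-memory model of gets/cas semantics, for illustration only."""

    def __init__(self):
        self._data = {}                  # key -> (value, cas_token)
        self._tokens = itertools.count(1)

    def set(self, key, value):
        # Every successful write assigns a fresh token
        self._data[key] = (value, next(self._tokens))

    def gets(self, key):
        # Returns (value, token), or (None, None) when the key is missing
        return self._data.get(key, (None, None))

    def cas(self, key, value, token):
        if key not in self._data:
            return None                  # models NOT_FOUND
        if self._data[key][1] != token:
            return False                 # models EXISTS (stale token)
        self.set(key, value)
        return True                      # models STORED


store = FakeCasStore()
store.set("stock", 10)

# Both clients read the same value and the same token
a_value, a_token = store.gets("stock")
b_value, b_token = store.gets("stock")

a_ok = store.cas("stock", a_value - 1, a_token)  # accepted, token changes
b_ok = store.cas("stock", b_value - 1, b_token)  # rejected, token is stale

final_value, _ = store.gets("stock")
```

Here client B's decrement never lands: the stored value reflects one update, not two, which is exactly the lost update a retry loop has to recover.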

Symptoms

- `cas` operations returning `EXISTS` under concurrent load
- Counter increment or decrement operations losing updates
- Application data inconsistencies for concurrently updated values
- `gets` returning a different CAS token for the same key on successive reads, which shows that another client wrote in between
- Silent data loss when CAS failures are not handled
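One way to confirm that these symptoms come from CAS conflicts is to look at the server's `stats` counters `cas_hits`, `cas_badval`, and `cas_misses`. The helper below is a minimal sketch: `cas_conflict_rate` is a hypothetical name, and it assumes the stats arrive as a dictionary (for example from pymemcache's `client.stats()`, which is assumed here to use byte-string keys; plain string keys are accepted as well).

```python
def cas_conflict_rate(stats):
    """Fraction of cas requests rejected because the token was stale.

    `stats` is assumed to be a dict of Memcached stats; keys may be
    bytes or str, values may be ints or numeric strings.
    """
    def read(name):
        raw = stats.get(name.encode(), stats.get(name, 0))
        return int(raw)

    hits = read("cas_hits")        # cas requests that stored a value
    badval = read("cas_badval")    # key found, but token did not match
    misses = read("cas_misses")    # cas requests against missing keys

    total = hits + badval + misses
    return badval / total if total else 0.0
```

A rate that climbs with traffic on a small set of keys points at contention rather than at a client bug.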

Common Causes

- Multiple application instances updating the same key concurrently
- No retry logic after CAS failure
- Long time between `gets` and `cas` operations, increasing conflict probability
- Using CAS for high-frequency counters instead of atomic operations
- Application not checking the CAS response status
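The last cause is the easiest to miss, because a rejected `cas` does not raise an error. As a hedged illustration, the sketch below maps the return value of pymemcache's `cas` (called with its default `noreply=False`) to an explicit outcome; `CasOutcome` and `classify_cas_result` are hypothetical names introduced for this example.

```python
from enum import Enum


class CasOutcome(Enum):
    STORED = "stored"        # write accepted
    CONFLICT = "conflict"    # key exists, but the token was stale
    MISSING = "missing"      # key no longer exists


def classify_cas_result(result):
    """Interpret the value returned by pymemcache's cas().

    Assumes noreply=False, where cas() returns True when stored,
    False on a token mismatch, and None when the key was not found.
    """
    if result is True:
        return CasOutcome.STORED
    if result is None:
        return CasOutcome.MISSING
    return CasOutcome.CONFLICT
```

Treating `CONFLICT` and `MISSING` separately matters: a conflict calls for re-reading and retrying, while a missing key usually means the value has to be recreated.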

Step-by-Step Fix

1. **Implement proper CAS retry logic**:

```python
import time

from pymemcache.client.base import Client

client = Client(('localhost', 11211))


def cas_update(key, transform_fn, max_retries=5):
    """Update a value with CAS, retrying when another client wins the race."""
    for attempt in range(max_retries):
        # gets returns (value, cas_token); both are None if the key is missing
        value, cas_token = client.gets(key)

        if value is None:
            # Key doesn't exist: add only succeeds if nobody else created it first
            if client.add(key, transform_fn(None), noreply=False):
                return True
        # cas returns True when stored, False on a stale token, None if the key vanished
        elif client.cas(key, transform_fn(value), cas_token):
            return True

        # Lost the race: back off exponentially, then retry
        time.sleep(0.01 * (2 ** attempt))

    raise RuntimeError(f"CAS update failed after {max_retries} retries")


# Usage: increment a counter
cas_update('page_views:123', lambda v: str(int(v or 0) + 1).encode())
```

2. **Use atomic operations instead of CAS for counters**:

```python
# Instead of CAS for counters, use incr/decr
client.set('counter:order:123', '0')
client.incr('counter:order:123', 1)  # Atomic increment by 1
client.incr('counter:order:123', 5)  # Atomic increment by 5
```

3. **Use append/prepend for list-like operations**:

```python
# Instead of CAS for appending to a list
client.append('log:session:123', b'\nnew log entry')
# Note: append is atomic in Memcached, but it only works on a key that already exists
```
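The atomic counter step assumes the key already exists; `incr` on a missing key does not create it. A minimal sketch of one way to handle that case follows. `incr_or_init` is a hypothetical helper, and it assumes a pymemcache-style client where `incr` returns `None` for a missing key and `add(..., noreply=False)` returns `False` when another client created the key first.

```python
def incr_or_init(client, key, delta=1):
    """Atomically increment a counter, creating it on first use.

    Assumes a pymemcache-style client: incr() returns the new value or
    None if the key is missing; add() with noreply=False returns True
    only for the client that actually created the key.
    """
    new_value = client.incr(key, delta)
    if new_value is not None:
        return new_value

    # Key is missing: try to create it with the first increment applied
    if client.add(key, str(delta), noreply=False):
        return delta

    # Another client created the key in the meantime, so incr works now.
    # This can still return None if the key is evicted again right away.
    return client.incr(key, delta)
```

Using `add` rather than `set` for the initial value is the design choice that matters here: `set` would overwrite increments that other clients applied between the failed `incr` and the write.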

Prevention

- Use atomic `incr`/`decr` for counters instead of CAS read-modify-write
- Keep the time between `gets` and `cas` as short as possible
- Implement exponential backoff retry for CAS failures
- Consider using Redis for workloads requiring complex atomic operations
- Monitor CAS failure rates and alert when they exceed acceptable thresholds
- Design data models to minimize concurrent writes to the same key
- Use key sharding (e.g., `counter:order:123:shard_0`) for hot keys
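The key sharding item can be sketched as follows. This is one possible layout under stated assumptions, not a prescribed scheme: `shard_key`, `init_shards`, `incr_sharded`, and `read_sharded` are hypothetical helpers, the shard count of 8 is arbitrary, and the client is assumed to be pymemcache-style (`add`, `incr`, `get_many`).

```python
import random


def shard_key(base_key, shard):
    """Build a shard key such as 'counter:order:123:shard_0'."""
    return f"{base_key}:shard_{shard}"


def init_shards(client, base_key, num_shards=8):
    """Create every shard at zero; add() leaves existing shards untouched."""
    for shard in range(num_shards):
        client.add(shard_key(base_key, shard), "0", noreply=False)


def incr_sharded(client, base_key, num_shards=8, delta=1):
    """Increment one randomly chosen shard to spread writes across keys."""
    key = shard_key(base_key, random.randrange(num_shards))
    return client.incr(key, delta)


def read_sharded(client, base_key, num_shards=8):
    """Sum all shards; missing shards simply contribute nothing."""
    keys = [shard_key(base_key, shard) for shard in range(num_shards)]
    found = client.get_many(keys)
    return sum(int(value) for value in found.values())
```

The trade-off is that reads become a multi-key fetch and the total is not a single atomic snapshot, so this fits counters where a slightly stale sum is acceptable.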