- All on demand
- Obtain synchronization tokens
- Fill attribute/data caches if necessary, based on tokens held and file modify times
- Writes cache-miss data to CFS (demand-fill path sketched below)
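A minimal sketch of this demand-fill read path, assuming hypothetical `token_server`, `cfs`, and `backend` interfaces (all names are illustrative, not from the source):

```python
class CachingClient:
    """Illustrative demand-fill read path: everything happens on demand."""

    def __init__(self, token_server, cfs, backend):
        self.token_server = token_server  # grants synchronization tokens
        self.cfs = cfs                    # local cache file system (CFS)
        self.backend = backend            # NFS or CLFS back-end

    def read(self, file_handle, offset, length):
        # 1. Obtain a synchronization token covering the byte range.
        self.token_server.get_read_token(file_handle, offset, length)

        # 2. The cache is usable only if we hold a token and the cached
        #    copy was filled under the file's current modify time.
        attrs = self.backend.get_attributes(file_handle)
        cached = self.cfs.lookup(file_handle, offset, length)
        if cached is not None and cached.mtime == attrs.mtime:
            return cached.data

        # 3. Cache miss: read from the back-end and write the data
        #    through to CFS so later reads are served locally.
        data = self.backend.read(file_handle, offset, length)
        self.cfs.store(file_handle, offset, data, mtime=attrs.mtime)
        return data
```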
- CFS to hold data/attributes
- Unix-like: inodes, indirect blocks
- Per-block metadata
- read-ahead from disk when sequential IO detected
- journaled for quick restart
- buffer cache holds hot data
- tracks block status
- flags whether a block was cached under the file's current mtime (per-block metadata sketched below)
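One plausible shape for the per-block metadata, with invented field names; the flag records whether the block was cached under the file's current mtime:

```python
from dataclasses import dataclass
from enum import Enum


class BlockStatus(Enum):
    CLEAN = "clean"        # matches the back-end copy
    DIRTY = "dirty"        # modified locally, not yet written back
    FETCHING = "fetching"  # read in flight from disk or back-end


@dataclass
class BlockMeta:
    """Per-block metadata kept alongside each cached block (illustrative)."""
    file_handle: bytes
    block_index: int
    status: BlockStatus
    current_mtime: bool  # True if cached under the file's current mtime
```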
- Large CFS buffer cache
- memory cache for hot data
- read from disk for warm data
- Read-ahead from the back-end (lookup order sketched below)
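A sketch of the tiered lookup order under these assumptions (hypothetical `memory_cache`, `disk_store`, and `backend` helpers): hot data from memory, warm data from local disk, misses from the back-end, with read-ahead when access looks sequential:

```python
class BufferCache:
    """Illustrative tiered lookup: memory -> local disk -> back-end."""

    def __init__(self, memory_cache, disk_store, backend, readahead_blocks=8):
        self.memory_cache = memory_cache
        self.disk_store = disk_store
        self.backend = backend
        self.readahead_blocks = readahead_blocks
        self.last_block = {}  # file_handle -> last block read, per file

    def read_block(self, file_handle, block):
        # Hot data: served straight from the in-memory cache.
        data = self.memory_cache.get((file_handle, block))
        if data is not None:
            return data

        # Warm data: read from local disk and promote to memory.
        data = self.disk_store.get((file_handle, block))
        if data is None:
            # Miss: fetch from the back-end and keep a copy on disk.
            data = self.backend.read_block(file_handle, block)
            self.disk_store.put((file_handle, block), data)
        self.memory_cache.put((file_handle, block), data)

        # Sequential access detected: prefetch the next few blocks.
        if self.last_block.get(file_handle) == block - 1:
            for b in range(block + 1, block + 1 + self.readahead_blocks):
                self.backend.prefetch(file_handle, b)
        self.last_block[file_handle] = block
        return data
```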
- NFS backend -> pass through
- CLFS backend -> get/put/enumerate objects
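A sketch of the two back-end flavors behind one interface, with invented method names and an assumed 4 MiB object size: NFS passes file reads through, while CLFS maps byte ranges onto get/put/enumerate-style objects:

```python
from abc import ABC, abstractmethod

OBJECT_SIZE = 4 << 20  # assumed 4 MiB objects; not from the source


class Backend(ABC):
    @abstractmethod
    def read(self, file_handle, offset, length): ...


class NFSBackend(Backend):
    """Pass-through: forward the read to the NFS server unchanged."""

    def __init__(self, nfs_client):
        self.nfs_client = nfs_client

    def read(self, file_handle, offset, length):
        return self.nfs_client.read(file_handle, offset, length)


class CLFSBackend(Backend):
    """Object back-end: get/put/enumerate whole objects."""

    def __init__(self, object_store):
        self.object_store = object_store

    def read(self, file_handle, offset, length):
        # Map the byte range onto an object, then slice out the range.
        key = f"{file_handle.hex()}/{offset // OBJECT_SIZE}"
        obj = self.object_store.get(key)
        start = offset % OBJECT_SIZE
        return obj[start:start + length]

    def enumerate(self, prefix=""):
        return self.object_store.list(prefix)
```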
- Token server
- returns read token
- optimistic get token: fails on conflict
- on conflict, the client sends reads to the node holding the dirty data (exchange sketched below)
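A sketch of that exchange, with invented names: the server grants read tokens optimistically, the get fails when another node holds a conflicting write token, and the client then sends the read to the node holding the dirty data:

```python
class TokenConflict(Exception):
    """Raised when an optimistic get collides with a conflicting holder."""

    def __init__(self, dirty_node):
        self.dirty_node = dirty_node  # node currently holding dirty data


class TokenServer:
    def __init__(self):
        self.write_holders = {}  # file_handle -> node holding a write token

    def get_read_token(self, file_handle, node):
        # Optimistic: grant immediately unless a writer already holds it.
        writer = self.write_holders.get(file_handle)
        if writer is not None and writer != node:
            raise TokenConflict(dirty_node=writer)
        return ("read", file_handle, node)


def client_read(token_server, backend, peers, file_handle, node):
    try:
        token_server.get_read_token(file_handle, node)
        return backend.read(file_handle)
    except TokenConflict as conflict:
        # Conflict: send the read to the node holding the dirty data.
        return peers[conflict.dirty_node].read(file_handle)
```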
- Attribute tokens (read/write)
- Data tokens (read range, write range)
- Some tokens are persistent
- write tokens protect modified data
- after a crash, the server must prevent conflicting updates (range-conflict rule sketched below)
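A sketch of the conflict rule such range tokens imply, with illustrative fields: two tokens conflict when their byte ranges overlap and at least one is a write; marking write tokens persistent is what lets the server block conflicting updates after a crash:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeToken:
    file_handle: bytes
    start: int        # first byte covered
    end: int          # one past the last byte covered
    write: bool       # write tokens protect modified data
    persistent: bool  # survives a crash of the holder


def conflicts(a: RangeToken, b: RangeToken) -> bool:
    """Two range tokens conflict iff they overlap and one is a write."""
    if a.file_handle != b.file_handle:
        return False
    overlap = a.start < b.end and b.start < a.end
    return overlap and (a.write or b.write)


# Example: a read of bytes [0, 4096) conflicts with a write of [1024, 2048).
r = RangeToken(b"\x01", 0, 4096, write=False, persistent=False)
w = RangeToken(b"\x01", 1024, 2048, write=True, persistent=True)
assert conflicts(r, w)
```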
- Token manager knows location of all cached data
- Ownership token eliminates need to store to backend
- Forwarding tokens reduce thrashing
- Sharded by file handle for load balancing (sketched below)
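Sharding can be as simple as hashing the file handle so every node agrees on which token server owns a given file; a minimal sketch:

```python
import hashlib


def token_shard(file_handle: bytes, num_shards: int) -> int:
    """Pick the token server for a file by hashing its handle."""
    digest = hashlib.sha1(file_handle).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# All clients route token traffic for the same file to the same shard.
assert token_shard(b"handle-42", 8) == token_shard(b"handle-42", 8)
```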
- Two main types of HPC workloads
- Embarrassingly parallel
- Nodes don't need to talk to each other, or need only very little cross-node communication
- Usually a parameter sweep, job splitting, or a search/comparison through data
- Examples: Monte Carlo simulations, image/video rendering
- Great workload for the cloud! CPU and storage are the primary resources (see the Monte Carlo sketch after this list)
- Tightly coupled
- Nodes need to talk to each other constantly
- Requires a fast interconnection network (low latency and high throughput)
- Examples: automotive crash simulation
- More challenging, but already possible on Azure! CPU and network are the primary resources
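A toy embarrassingly parallel job for concreteness: a Monte Carlo estimate of pi split across worker processes, each running independently with no communication until the final sum:

```python
import random
from multiprocessing import Pool


def monte_carlo_hits(samples: int) -> int:
    """Count random points that land inside the unit quarter-circle."""
    rng = random.Random()
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


if __name__ == "__main__":
    workers, samples_each = 8, 1_000_000
    with Pool(workers) as pool:
        # Each worker is independent: no communication until the final sum.
        hits = sum(pool.map(monte_carlo_hits, [samples_each] * workers))
    print("pi ~", 4 * hits / (workers * samples_each))
```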
- Embarrassingly parallel
- Know your workload targets
- Every system has strengths and weaknesses
- Latency, Throughput, or both
- Price/Performance or Outright performance
- Benchmarks vs. Reality
- Caching for rendering workloads is very effective (back-of-the-envelope math below)
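Back-of-the-envelope math for why: with a high hit rate, average latency approaches the cache tier's. The numbers below are made up for illustration:

```python
def avg_latency(hit_rate, t_cache_ms, t_backend_ms):
    """Expected latency given a cache hit rate and per-tier latencies."""
    return hit_rate * t_cache_ms + (1 - hit_rate) * t_backend_ms


# Rendering re-reads the same scene assets, so hit rates run high.
print(avg_latency(0.95, 0.2, 20.0))  # ~1.19 ms vs 20 ms uncached
```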