Data Layer & Caching
Database design, ORMs, caching strategies, and data modeling best practices for scalable applications.
Data Layer & Caching
A well-designed data layer is the foundation of every scalable application. We focus on data modeling, query optimization, caching strategies, and choosing the right database for your use case.
Database-first thinking
We start with data modeling — understanding relationships, access patterns, and consistency requirements — before writing code. This ensures your schema supports your application's needs as it grows.
Database Selection
We choose databases based on access patterns, consistency needs, and scale.
| Database Type | When to Use | Examples |
|---|---|---|
| Relational (SQL) | Structured data, complex queries, transactions | PostgreSQL, MySQL |
| Document | Flexible schema, nested data, fast reads | MongoDB, DynamoDB |
| Key-Value | Session storage, caching, simple lookups | Redis, DynamoDB |
| Graph | Relationships and traversals (social, recommendations) | Neo4j, Amazon Neptune |
| Time-Series | Metrics, logs, IoT data | TimescaleDB, InfluxDB |
| Search | Full-text search, autocomplete, aggregations | Elasticsearch, Algolia |
Relational Databases (SQL)
Our default choice for most applications — mature, transactional, and powerful.
PostgreSQL
Why we love Postgres:
- ACID transactions: Strong consistency guarantees
- Rich data types: JSON, arrays, UUID, full-text search
- Extensions: PostGIS (geospatial), pg_vector (embeddings), TimescaleDB (time-series)
- Performance: Indexes, materialized views, partitioning
Schema design best practices:
- Normalization: Reduce redundancy (3NF for most tables)
- Indexes: Add indexes on foreign keys and frequently queried columns
- Constraints: Use
UNIQUE,NOT NULL,CHECKconstraints to enforce rules - Migrations: Version-controlled schema changes (Prisma Migrate, TypeORM migrations)
-- Example: E-commerce schema
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT UNIQUE NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
status TEXT NOT NULL CHECK (status IN ('pending', 'paid', 'shipped')),
total_amount DECIMAL(10,2) NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);ORMs & Query Builders
We use type-safe ORMs to prevent SQL injection and reduce boilerplate.
Prisma
Our preferred ORM for TypeScript projects — type-safe, migrations, and great DX.
// Schema definition
model User {
id String @id @default(uuid())
email String @unique
orders Order[]
createdAt DateTime @default(now())
}
model Order {
id String @id @default(uuid())
userId String
user User @relation(fields: [userId], references: [id])
status String
totalAmount Decimal
createdAt DateTime @default(now())
}
// Usage in code (fully type-safe)
const orders = await prisma.order.findMany({
where: { userId: 'user-123', status: 'paid' },
include: { user: true },
orderBy: { createdAt: 'desc' },
});Drizzle ORM
Lightweight, SQL-like syntax, perfect for performance-critical apps.
const orders = await db
.select()
.from(ordersTable)
.where(eq(ordersTable.userId, 'user-123'))
.leftJoin(usersTable, eq(ordersTable.userId, usersTable.id));ORM pitfalls
- N+1 queries: Use
includeorjoinsto fetch related data - Over-fetching: Select only needed columns
- Missing indexes: ORMs don't auto-create indexes — you must define them
Caching Strategies
Caching reduces database load and improves response times. We use multi-layer caching.
1. In-Memory Cache (Redis)
Cache layers:
Use cases:
- Session storage
- API response caching
- Rate limiting
- Pub/Sub for real-time features
// Cache API responses for 5 minutes
const cacheKey = `projects:${userId}`;
const cached = await redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
const projects = await db.query.projects.findMany({ where: { userId } });
await redis.setex(cacheKey, 300, JSON.stringify(projects));
return projects;2. Application-Level Cache
Use cases:
- Rarely changing config or reference data
- Computed results (e.g., aggregations)
import { LRUCache } from 'lru-cache';
const cache = new LRUCache<string, any>({
max: 500, // Store up to 500 items
ttl: 1000 * 60 * 5, // 5 minutes
});
function getCachedConfig() {
const cached = cache.get('app-config');
if (cached) return cached;
const config = loadConfigFromDB();
cache.set('app-config', config);
return config;
}3. CDN / Edge Caching
Use cases:
- Static assets (images, CSS, JS)
- API responses for public, cacheable endpoints
Headers for caching:
Cache-Control: public, max-age=3600, s-maxage=86400
ETag: "abc123"4. Database Query Cache
Use cases:
- Read-heavy queries (e.g., product catalogs)
- Materialized views for complex aggregations
-- Materialized view for fast dashboard queries
CREATE MATERIALIZED VIEW user_stats AS
SELECT
u.id,
u.email,
COUNT(o.id) as order_count,
SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email;
-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;Caching Patterns
| Pattern | When to Use | Invalidation Strategy |
|---|---|---|
| Cache-Aside | Most common; app checks cache first | TTL + manual invalidation on write |
| Write-Through | Cache is updated on every write | No invalidation needed |
| Write-Behind | High write throughput | Background sync to DB |
| Read-Through | Cache automatically loads on miss | TTL + eviction policies |
Cache-Aside (most common):
Cache invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
We use TTL (time-to-live) + event-based invalidation to keep caches fresh.
Data Modeling Best Practices
Think in Access Patterns
Design your schema based on how data will be queried, not just how it's structured.
Use Indexes Wisely
Index foreign keys and frequently filtered columns, but avoid over-indexing (slows writes).
Denormalize When Needed
For read-heavy apps, store computed data (e.g., order totals) to avoid joins.
Partition Large Tables
Split tables by date, region, or tenant for better query performance.