Database design, ORMs, caching strategies, and data modeling best practices for scalable applications.

Data Layer & Caching

Database design, ORMs, caching strategies, and data modeling best practices for scalable applications.

A well-designed data layer is the foundation of every scalable application. We focus on data modeling, query optimization, caching strategies, and choosing the right database for your use case.

Database-first thinking

We start with data modeling — understanding relationships, access patterns, and consistency requirements — before writing code. This ensures your schema supports your application's needs as it grows.

Database Selection

We choose databases based on access patterns, consistency needs, and scale.

Database Type	When to Use	Examples
Relational (SQL)	Structured data, complex queries, transactions	PostgreSQL, MySQL
Document	Flexible schema, nested data, fast reads	MongoDB, DynamoDB
Key-Value	Session storage, caching, simple lookups	Redis, DynamoDB
Graph	Relationships and traversals (social, recommendations)	Neo4j, Amazon Neptune
Time-Series	Metrics, logs, IoT data	TimescaleDB, InfluxDB
Search	Full-text search, autocomplete, aggregations	Elasticsearch, Algolia

Relational Databases (SQL)

Our default choice for most applications — mature, transactional, and powerful.

PostgreSQL

Why we love Postgres:

ACID transactions: Strong consistency guarantees
Rich data types: JSON, arrays, UUID, full-text search
Extensions: PostGIS (geospatial), pg_vector (embeddings), TimescaleDB (time-series)
Performance: Indexes, materialized views, partitioning

Schema design best practices:

Normalization: Reduce redundancy (3NF for most tables)
Indexes: Add indexes on foreign keys and frequently queried columns
Constraints: Use UNIQUE, NOT NULL, CHECK constraints to enforce rules
Migrations: Version-controlled schema changes (Prisma Migrate, TypeORM migrations)

-- Example: E-commerce schema
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  status TEXT NOT NULL CHECK (status IN ('pending', 'paid', 'shipped')),
  total_amount DECIMAL(10,2) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);

ORMs & Query Builders

We use type-safe ORMs to prevent SQL injection and reduce boilerplate.

Prisma

Our preferred ORM for TypeScript projects — type-safe, migrations, and great DX.

// Schema definition
model User {
  id        String   @id @default(uuid())
  email     String   @unique
  orders    Order[]
  createdAt DateTime @default(now())
}

model Order {
  id        String   @id @default(uuid())
  userId    String
  user      User     @relation(fields: [userId], references: [id])
  status    String
  totalAmount Decimal
  createdAt DateTime @default(now())
}

// Usage in code (fully type-safe)
const orders = await prisma.order.findMany({
  where: { userId: 'user-123', status: 'paid' },
  include: { user: true },
  orderBy: { createdAt: 'desc' },
});

Drizzle ORM

Lightweight, SQL-like syntax, perfect for performance-critical apps.

const orders = await db
  .select()
  .from(ordersTable)
  .where(eq(ordersTable.userId, 'user-123'))
  .leftJoin(usersTable, eq(ordersTable.userId, usersTable.id));

ORM pitfalls

N+1 queries: Use include or joins to fetch related data
Over-fetching: Select only needed columns
Missing indexes: ORMs don't auto-create indexes — you must define them

Caching Strategies

Caching reduces database load and improves response times. We use multi-layer caching.

1. In-Memory Cache (Redis)

Cache layers:

Use cases:

Session storage
API response caching
Rate limiting
Pub/Sub for real-time features

// Cache API responses for 5 minutes
const cacheKey = `projects:${userId}`;
const cached = await redis.get(cacheKey);

if (cached) {
  return JSON.parse(cached);
}

const projects = await db.query.projects.findMany({ where: { userId } });
await redis.setex(cacheKey, 300, JSON.stringify(projects));

return projects;

2. Application-Level Cache

Use cases:

Rarely changing config or reference data
Computed results (e.g., aggregations)

import { LRUCache } from 'lru-cache';

const cache = new LRUCache<string, any>({
  max: 500, // Store up to 500 items
  ttl: 1000 * 60 * 5, // 5 minutes
});

function getCachedConfig() {
  const cached = cache.get('app-config');
  if (cached) return cached;

  const config = loadConfigFromDB();
  cache.set('app-config', config);
  return config;
}

3. CDN / Edge Caching

Use cases:

Static assets (images, CSS, JS)
API responses for public, cacheable endpoints

Headers for caching:

Cache-Control: public, max-age=3600, s-maxage=86400
ETag: "abc123"

4. Database Query Cache

Use cases:

Read-heavy queries (e.g., product catalogs)
Materialized views for complex aggregations

-- Materialized view for fast dashboard queries
CREATE MATERIALIZED VIEW user_stats AS
SELECT
  u.id,
  u.email,
  COUNT(o.id) as order_count,
  SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email;

-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

Caching Patterns

Pattern	When to Use	Invalidation Strategy
Cache-Aside	Most common; app checks cache first	TTL + manual invalidation on write
Write-Through	Cache is updated on every write	No invalidation needed
Write-Behind	High write throughput	Background sync to DB
Read-Through	Cache automatically loads on miss	TTL + eviction policies

Cache-Aside (most common):

Cache invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

We use TTL (time-to-live) + event-based invalidation to keep caches fresh.

Data Modeling Best Practices

Think in Access Patterns

Design your schema based on how data will be queried, not just how it's structured.

Use Indexes Wisely

Index foreign keys and frequently filtered columns, but avoid over-indexing (slows writes).

Denormalize When Needed

For read-heavy apps, store computed data (e.g., order totals) to avoid joins.

Partition Large Tables

Split tables by date, region, or tenant for better query performance.

Data Layer & Caching

Data Layer & Caching

Database Selection

Relational Databases (SQL)

PostgreSQL

ORMs & Query Builders

Prisma

Drizzle ORM

Caching Strategies

1. In-Memory Cache (Redis)

2. Application-Level Cache

3. CDN / Edge Caching

4. Database Query Cache

Caching Patterns

Data Modeling Best Practices

Think in Access Patterns

Use Indexes Wisely

Denormalize When Needed

Partition Large Tables

Explore More

Backend Stacks

APIs & Services

Integrations

On this page