Data Layer & Caching

Database design, ORMs, caching strategies, and data modeling best practices for scalable applications.

Data Layer & Caching

Database design, ORMs, caching strategies, and data modeling best practices for scalable applications.

A well-designed data layer is the foundation of every scalable application. We focus on data modeling, query optimization, caching strategies, and choosing the right database for your use case.

Database-first thinking

We start with data modeling — understanding relationships, access patterns, and consistency requirements — before writing code. This ensures your schema supports your application's needs as it grows.

Database Selection

We choose databases based on access patterns, consistency needs, and scale.

Database TypeWhen to UseExamples
Relational (SQL)Structured data, complex queries, transactionsPostgreSQL, MySQL
DocumentFlexible schema, nested data, fast readsMongoDB, DynamoDB
Key-ValueSession storage, caching, simple lookupsRedis, DynamoDB
GraphRelationships and traversals (social, recommendations)Neo4j, Amazon Neptune
Time-SeriesMetrics, logs, IoT dataTimescaleDB, InfluxDB
SearchFull-text search, autocomplete, aggregationsElasticsearch, Algolia

Relational Databases (SQL)

Our default choice for most applications — mature, transactional, and powerful.

PostgreSQL

Why we love Postgres:

  • ACID transactions: Strong consistency guarantees
  • Rich data types: JSON, arrays, UUID, full-text search
  • Extensions: PostGIS (geospatial), pg_vector (embeddings), TimescaleDB (time-series)
  • Performance: Indexes, materialized views, partitioning

Schema design best practices:

  • Normalization: Reduce redundancy (3NF for most tables)
  • Indexes: Add indexes on foreign keys and frequently queried columns
  • Constraints: Use UNIQUE, NOT NULL, CHECK constraints to enforce rules
  • Migrations: Version-controlled schema changes (Prisma Migrate, TypeORM migrations)
-- Example: E-commerce schema
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  status TEXT NOT NULL CHECK (status IN ('pending', 'paid', 'shipped')),
  total_amount DECIMAL(10,2) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);

ORMs & Query Builders

We use type-safe ORMs to prevent SQL injection and reduce boilerplate.

Prisma

Our preferred ORM for TypeScript projects — type-safe, migrations, and great DX.

// Schema definition
model User {
  id        String   @id @default(uuid())
  email     String   @unique
  orders    Order[]
  createdAt DateTime @default(now())
}

model Order {
  id        String   @id @default(uuid())
  userId    String
  user      User     @relation(fields: [userId], references: [id])
  status    String
  totalAmount Decimal
  createdAt DateTime @default(now())
}

// Usage in code (fully type-safe)
const orders = await prisma.order.findMany({
  where: { userId: 'user-123', status: 'paid' },
  include: { user: true },
  orderBy: { createdAt: 'desc' },
});

Drizzle ORM

Lightweight, SQL-like syntax, perfect for performance-critical apps.

const orders = await db
  .select()
  .from(ordersTable)
  .where(eq(ordersTable.userId, 'user-123'))
  .leftJoin(usersTable, eq(ordersTable.userId, usersTable.id));

ORM pitfalls

  • N+1 queries: Use include or joins to fetch related data
  • Over-fetching: Select only needed columns
  • Missing indexes: ORMs don't auto-create indexes — you must define them

Caching Strategies

Caching reduces database load and improves response times. We use multi-layer caching.

1. In-Memory Cache (Redis)

Cache layers:

Use cases:

  • Session storage
  • API response caching
  • Rate limiting
  • Pub/Sub for real-time features
// Cache API responses for 5 minutes
const cacheKey = `projects:${userId}`;
const cached = await redis.get(cacheKey);

if (cached) {
  return JSON.parse(cached);
}

const projects = await db.query.projects.findMany({ where: { userId } });
await redis.setex(cacheKey, 300, JSON.stringify(projects));

return projects;

2. Application-Level Cache

Use cases:

  • Rarely changing config or reference data
  • Computed results (e.g., aggregations)
import { LRUCache } from 'lru-cache';

const cache = new LRUCache<string, any>({
  max: 500, // Store up to 500 items
  ttl: 1000 * 60 * 5, // 5 minutes
});

function getCachedConfig() {
  const cached = cache.get('app-config');
  if (cached) return cached;

  const config = loadConfigFromDB();
  cache.set('app-config', config);
  return config;
}

3. CDN / Edge Caching

Use cases:

  • Static assets (images, CSS, JS)
  • API responses for public, cacheable endpoints

Headers for caching:

Cache-Control: public, max-age=3600, s-maxage=86400
ETag: "abc123"

4. Database Query Cache

Use cases:

  • Read-heavy queries (e.g., product catalogs)
  • Materialized views for complex aggregations
-- Materialized view for fast dashboard queries
CREATE MATERIALIZED VIEW user_stats AS
SELECT
  u.id,
  u.email,
  COUNT(o.id) as order_count,
  SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email;

-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

Caching Patterns

PatternWhen to UseInvalidation Strategy
Cache-AsideMost common; app checks cache firstTTL + manual invalidation on write
Write-ThroughCache is updated on every writeNo invalidation needed
Write-BehindHigh write throughputBackground sync to DB
Read-ThroughCache automatically loads on missTTL + eviction policies

Cache-Aside (most common):

Cache invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

We use TTL (time-to-live) + event-based invalidation to keep caches fresh.

Data Modeling Best Practices

Think in Access Patterns

Design your schema based on how data will be queried, not just how it's structured.

Use Indexes Wisely

Index foreign keys and frequently filtered columns, but avoid over-indexing (slows writes).

Denormalize When Needed

For read-heavy apps, store computed data (e.g., order totals) to avoid joins.

Partition Large Tables

Split tables by date, region, or tenant for better query performance.

Explore More