Funnels & Sessions
Stitch impression → detail → play with attribution windows; derive sessions from event streams.
TL;DR
Two related shapes. Funnels stitch a sequence of events (impression → detail → play) for the same user within an attribution window. Sessionspartition a single user's events into bursts separated by inactivity gaps. Both lean on self-joins over playback_events with time-window predicates.
Reach for this pattern when you see "rate of A → B", "within N minutes", "same session", or "time to X". The senior-interview signal is whether you control for fan-out: a naïve join can match one impression to many downstream events and triple-count it.
The skeleton
Funnel skeleton: aggregate to one row per impression-key (member, title, session) before joining downstream events. Sessionization skeleton lives further down inside q11's solution — same shape with LAG + gap > 30 min.
-- Funnel: A → B within an attribution window
WITH base AS (
-- 1) Dedup to one row per impression-key. MIN(event_ts) is the impression ts.
SELECT member_id, title_id, session_id,
MIN(event_ts)::TIMESTAMP AS imp_ts
FROM playback_events
WHERE entry_source = '<< SURFACE >>' -- e.g. 'homepage_row'
AND event_name = 'impression'
GROUP BY member_id, title_id, session_id
),
next_step AS (
-- 2) Find earliest downstream event within the window, then aggregate to
-- impression-key. Aggregating BEFORE counting is what kills fan-out.
SELECT b.member_id, b.title_id, b.session_id,
MIN(e.event_ts)::TIMESTAMP AS step_ts
FROM base b
JOIN playback_events e
ON e.member_id = b.member_id
AND e.title_id = b.title_id
AND e.session_id = b.session_id
AND e.event_name = '<< NEXT_EVENT >>' -- e.g. 'title_detail_view'
AND e.event_ts::TIMESTAMP BETWEEN b.imp_ts
AND b.imp_ts + INTERVAL '<< N >>' MINUTE
GROUP BY b.member_id, b.title_id, b.session_id
)
SELECT
COUNT(*) AS impressions,
COUNT(n.member_id) AS converted,
ROUND(100.0 * COUNT(n.member_id)
/ NULLIF(COUNT(*), 0), 2) AS conversion_pct
FROM base b
LEFT JOIN next_step n USING (member_id, title_id, session_id);The trap: Fan-out from multi-match joins
One impression event can match multiple downstream events that satisfy the time window. If you join and then COUNT(*) the result, you count that impression once per downstream event— conversion rates climb past 100% and you can't reproduce them.
The fix: aggregate before counting. Collapse each side of the join to one row per impression-key with MIN(event_ts) first. Now the join is impression-to-key, not impression-to-event-stream, and conversion counts make sense.
Wrong
-- ❌ Fan-out: one impression with 3 matching plays counts as 3
SELECT COUNT(*) FILTER (WHERE p.event_name = 'play_start')
/ COUNT(*) FILTER (WHERE i.event_name = 'impression') AS rate
FROM playback_events i
JOIN playback_events p
ON p.member_id = i.member_id
AND p.title_id = i.title_id
AND p.event_ts BETWEEN i.event_ts AND i.event_ts + INTERVAL 30 MINUTE
WHERE i.event_name = 'impression';Right
-- ✅ Aggregate impressions and plays to one row per key, then compare
WITH imps AS (
SELECT member_id, title_id, session_id,
MIN(event_ts)::TIMESTAMP AS imp_ts
FROM playback_events
WHERE event_name = 'impression'
GROUP BY member_id, title_id, session_id
),
plays AS (
SELECT i.member_id, i.title_id, i.session_id
FROM imps i
JOIN playback_events p
ON p.member_id = i.member_id
AND p.title_id = i.title_id
AND p.session_id = i.session_id
AND p.event_name = 'play_start'
AND p.event_ts::TIMESTAMP BETWEEN i.imp_ts
AND i.imp_ts + INTERVAL 30 MINUTE
GROUP BY i.member_id, i.title_id, i.session_id
)
SELECT COUNT(*) AS impressions,
(SELECT COUNT(*) FROM plays) AS plays;