Creating a data dictionary for a product analytics team requires defining four elements for every metric and event: a precise definition in plain language, the calculation or tracking logic, the owner responsible for accuracy, and the last-verified date. Without all four, a data dictionary is a wishlist, not a source of truth.
Product analytics teams fail at data dictionaries for one of two reasons: either they never build one (everyone learns definitions informally and accumulates contradictory mental models), or they build one but don't maintain it (it becomes a historical artifact that new team members ignore because it's wrong half the time).
This guide shows you how to build one that stays alive.
What Is a Product Analytics Data Dictionary?
A data dictionary is a structured reference document that defines every metric, event, and dimension used in product analytics. Its purpose is to ensure everyone on the product, engineering, and data teams is talking about the same thing when they say "active user" or "conversion" or "engagement."
The Four Required Fields for Every Entry
Every entry in your data dictionary must include:
- Name: The canonical term used in dashboards, queries, and discussions
- Definition: Plain language description of what it measures and why it matters
- Calculation or tracking logic: Exactly how it is computed or instrumented
- Owner + last verified date: Who is responsible for accuracy and when it was last checked
Without the owner and last verified date, your dictionary will drift — entries become stale within 6 months of any significant product change.
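The four required fields map naturally onto a small record type, which also makes staleness checkable by a script rather than by memory. A minimal sketch, assuming a six-month freshness window (the class and field names here are illustrative, not a standard schema):

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DictionaryEntry:
    """One entry in the data dictionary, with the four required fields."""
    name: str           # canonical term used in dashboards and queries
    definition: str     # plain-language description
    logic: str          # calculation or tracking logic
    owner: str          # team or person responsible for accuracy
    last_verified: date

    def is_stale(self, max_age_days: int = 180) -> bool:
        """True if the entry has not been verified in ~6 months."""
        return (date.today() - self.last_verified) > timedelta(days=max_age_days)
```

A quarterly audit can then be as simple as listing every entry where `is_stale()` is true and pinging its owner.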
Section 1: Event Dictionary
Events are the raw actions users take in your product. Define every tracked event with:
Event name: [snake_case_name]
Description: [What user action triggers this event?]
Triggered when: [Precise condition — e.g., "user clicks Save button on the
project settings page after making at least one change"]
Properties: [List of key-value pairs sent with the event]
Owner: [Engineering team or PM responsible for instrumentation accuracy]
Last verified: [Date]
Do not confuse with: [Similar events that are often mixed up]
Example entry:
Event name: project_saved
Description: User successfully saves changes to a project
Triggered when: Save button clicked AND API returns 200 success response
Properties: project_id, user_id, change_count, project_type
Owner: Core Product Engineering
Last verified: 2026-01-15
Do not confuse with: project_created (new project), project_autosaved (autosave, no user action)
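Once properties are declared in the dictionary, instrumentation drift becomes detectable: a payload missing a declared property is a bug, not a judgment call. A sketch of that check, reusing the `project_saved` example above (the function and data shapes are illustrative):

```python
# Declared properties per event, taken from the dictionary entries.
REQUIRED_PROPERTIES = {
    "project_saved": {"project_id", "user_id", "change_count", "project_type"},
}


def missing_properties(event_name: str, payload: dict) -> set:
    """Return declared properties absent from a raw event payload."""
    declared = REQUIRED_PROPERTIES.get(event_name, set())
    return declared - set(payload)
```

Running this against a sample of recent events surfaces under-instrumented releases before they corrupt downstream metrics.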
Section 2: Metric Dictionary
Metrics are derived calculations built from events. Define each metric with:
Metric name: [canonical name as used in dashboards]
Definition: [What business question does this metric answer?]
Calculation: [Exact formula — e.g., (users with ≥1 project_saved in period) / (total active users)]
Numerator: [Definition of the numerator]
Denominator: [Definition of the denominator]
Granularity: [Daily, weekly, monthly]
Owner: [PM or analytics engineer responsible]
Last verified: [Date]
Common misuses: [How this metric is frequently misinterpreted]
According to Shreyas Doshi on Lenny's Podcast, metric clarity is one of the most underrated drivers of organizational alignment — when "active user" means different things to product, marketing, and finance, every performance review becomes a negotiation about definitions rather than a conversation about outcomes.
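The example formula in the template above (users with ≥1 `project_saved` in the period, divided by total active users) can be written down as code, which is often the clearest way to pin the numerator and denominator. A minimal sketch, assuming event rows arrive as `(user_id, event_name)` pairs; adapt the shapes to your warehouse:

```python
def save_rate(events: list[tuple[str, str]], active_users: set[str]) -> float:
    """Share of active users with at least one project_saved event.

    Numerator: active users who saved at least once in the period.
    Denominator: all active users in the period.
    """
    savers = {user for user, event_name in events if event_name == "project_saved"}
    if not active_users:
        return 0.0
    return len(savers & active_users) / len(active_users)
```

Note that the code forces a decision the prose can dodge: a saver who is not in the active-user set does not count, which is exactly the kind of edge case the "Common misuses" field should record.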
Section 3: Dimension Dictionary
Dimensions are the attributes used to segment and filter your metrics: plan tier, user role, device type, cohort, geography. Define each with:
- Name and allowed values
- Which system is the source of truth
- How it maps to your product constructs
- Known data quality issues or edge cases
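A dimension entry with declared allowed values lets you catch bad segmentation data mechanically. A sketch, with a hypothetical `plan_tier` dimension (the values and source names are examples, not your schema):

```python
# Illustrative dimension definitions: allowed values plus source of truth.
DIMENSIONS = {
    "plan_tier": {
        "allowed_values": {"free", "pro", "enterprise"},
        "source_of_truth": "billing system",
    },
}


def invalid_values(dimension: str, observed: set) -> set:
    """Return observed dimension values not in the allowed set."""
    return observed - DIMENSIONS[dimension]["allowed_values"]
```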
Building and Maintaining the Dictionary
Step 1: Audit What Already Exists
Before writing new definitions, inventory what's currently tracked. Pull your event schema from your analytics tool (Amplitude, Mixpanel, Segment, or your data warehouse) and list every event and property.
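In most warehouses the inventory is a single GROUP BY over the raw events table. A runnable sketch using an in-memory SQLite stand-in (the `raw_events` table and its columns are assumptions; substitute your warehouse's schema and client):

```python
import sqlite3

# Stand-in for the warehouse: one row per tracked event occurrence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (event_name TEXT, property_keys TEXT)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [
        ("project_saved", "project_id,user_id"),
        ("project_saved", "project_id,user_id"),
        ("project_created", "project_id"),
    ],
)

# The audit itself: every distinct event name with its volume.
inventory = conn.execute(
    "SELECT event_name, COUNT(*) FROM raw_events "
    "GROUP BY event_name ORDER BY event_name"
).fetchall()
```

Low-volume events in this inventory are often dead instrumentation; flag them for the deprecation process rather than writing definitions for them.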
Step 2: Interview Stakeholders for Each Major Metric
For each metric in your key dashboards, interview the PM, data analyst, and engineering lead: "How do you personally define this metric? How do you calculate it?" Document the disagreements — they reveal where ambiguity is causing downstream decisions to diverge.
Step 3: Write Definitions Collaboratively
Don't let one person write all definitions in isolation. Draft each definition and get explicit sign-off from the PM who uses it and the engineer who instruments it. Unreviewed definitions have a high error rate.
Step 4: Establish a Change Process
The dictionary needs a formal update process tied to your product development cycle:
- Any event or metric change requires a dictionary update before it ships
- Quarterly reviews of all entries to verify accuracy
- A deprecation process for retired events and metrics (don't delete — mark as deprecated with a redirect to the replacement)
According to Gibson Biddle on Lenny's Podcast, product teams that invest in metric hygiene — precise definitions, clear ownership, and regular audits — consistently outperform teams that treat data infrastructure as an afterthought, because clean data enables faster and more confident product decisions.
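The change process above is enforceable in CI: block a release if a tracked event lacks a dictionary entry, or if a deprecated entry has no replacement pointer. A sketch under those assumptions (the entry shape with `deprecated` and `replacement` keys is illustrative):

```python
def dictionary_violations(tracked_events: set, dictionary: dict) -> list[str]:
    """Return human-readable problems that should block a release."""
    problems = []
    # Rule 1: every tracked event must have a dictionary entry.
    for event in sorted(tracked_events - set(dictionary)):
        problems.append(f"{event}: no dictionary entry")
    # Rule 2: deprecated entries must redirect to a replacement.
    for name, entry in sorted(dictionary.items()):
        if entry.get("deprecated") and not entry.get("replacement"):
            problems.append(f"{name}: deprecated without a replacement")
    return problems
```

Wiring this into the same pipeline that ships tracking changes is what keeps the "update before it ships" rule from relying on memory.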
Step 5: Make It Discoverable
The best data dictionary is worthless if nobody can find it. Requirements:
- Single canonical location (not multiple Confluence pages and a Notion doc)
- Linked from every dashboard in your BI tool
- Mentioned in new PM and analyst onboarding
- Searchable by event name, metric name, and business concept
According to Elena Verna on Lenny's Podcast discussing growth infrastructure, teams that treat their data dictionary as a product — with ownership, maintenance, and discoverability — have dramatically higher analytical velocity than teams that rely on tribal knowledge about metric definitions.
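Searchable "by business concept" means matching against definitions, not just canonical names. A minimal sketch of that lookup (the entry shape, a name-to-definition mapping, is illustrative; a catalog tool would index more fields):

```python
def search(dictionary: dict[str, str], query: str) -> list[str]:
    """Return entry names whose name or definition matches the query."""
    q = query.lower()
    return sorted(
        name
        for name, definition in dictionary.items()
        if q in name.lower() or q in definition.lower()
    )
```

A PM searching "active" should land on the right metric even without knowing it is called `dau` internally, and this is the property to test for whatever tool you choose.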
FAQ
Q: What is a product analytics data dictionary? A: A structured reference document that defines every metric, event, and dimension used in product analytics, including calculation logic, ownership, and last verified date.
Q: What should a product analytics data dictionary include? A: Three sections: event dictionary (raw user actions), metric dictionary (derived calculations), and dimension dictionary (segmentation attributes). Each entry needs definition, calculation, owner, and last verified date.
Q: How do you maintain a data dictionary as your product changes? A: Require a dictionary update before any event or metric change ships. Conduct quarterly audits of all entries. Assign clear owners per section. Use a deprecation process rather than deleting retired entries.
Q: Who should own the product analytics data dictionary? A: Individual entries should have functional owners (PM or analytics engineer per metric). Overall dictionary governance is typically owned by the analytics engineering or data team, not delegated to a single person.
Q: What tools should you use to host a product analytics data dictionary? A: Confluence, Notion, or a dedicated data catalog tool like dbt Docs or DataHub. The key requirement is discoverability — it must be linked from dashboards and included in onboarding.
HowTo: Create a Data Dictionary for a Product Analytics Team
- Audit your existing event schema by pulling every tracked event and property from your analytics tool or data warehouse to understand what is currently instrumented
- Interview PMs, analysts, and engineers on how they personally define each major metric — document all disagreements as these reveal where definition ambiguity is causing divergent decisions
- Write event dictionary entries for every tracked event including description, trigger condition, properties, owner, and last verified date
- Write metric dictionary entries for every dashboard metric including plain language definition, exact calculation formula, numerator and denominator definitions, and common misuses
- Establish a change process requiring a dictionary update before any event or metric modification ships and quarterly audits to verify all entry accuracy
- Make the dictionary discoverable by linking it from every dashboard in your BI tool, hosting it in a single canonical location, and including it in new PM and analyst onboarding