CDM Rules of the Road

How to use the Common Data Model correctly


Why this exists

The CDM is designed to provide consistent, reliable, and scalable data across the business.

It is not a flat reporting dataset.

If it is used incorrectly, you will see:

  • duplicate records
  • broken joins
  • incorrect totals

These rules ensure you get the right answers.


1. Respect Domain Boundaries

Each domain has a specific purpose:

Domain      Purpose
Person      Identity (who someone is)
Employee    Employment (their role)
Project     Delivery context
Customer    External relationships

Do not assume domains can be joined directly without understanding their role.


2. Never Join on OBJECT_SEQ Across Domains

OBJECT_SEQ is:

  • a domain key
  • designed for within-domain use only

❌ Do NOT use it to join:

  • Person ↔ Employee
  • Project ↔ Customer
  • Any cross-domain relationship

👉 This is the #1 cause of incorrect reporting.


3. Use Canonical Keys for Linking

Cross-domain joins must use shared identity attributes.

For People data, use:

PERSON_UID

This ensures:

  • consistent identity
  • correct 1-to-many relationships
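
A minimal sketch of the difference, using Python's built-in sqlite3 module and an in-memory database. The tables person_core and employee_core and their sample rows are hypothetical stand-ins, not actual CDM views; only OBJECT_SEQ and PERSON_UID come from the rules above. The sketch assumes each domain numbers OBJECT_SEQ independently, so values can coincide across domains by chance.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical stand-ins for a Person view and an Employee view.
    CREATE TABLE person_core   (OBJECT_SEQ INTEGER, PERSON_UID TEXT, NAME TEXT);
    CREATE TABLE employee_core (OBJECT_SEQ INTEGER, PERSON_UID TEXT, ROLE TEXT);

    INSERT INTO person_core VALUES (1, 'P-100', 'Alice'), (2, 'P-200', 'Bob');

    -- Assumption: OBJECT_SEQ is numbered per domain, so 1 and 2 here are
    -- unrelated to the 1 and 2 in person_core.
    INSERT INTO employee_core VALUES
        (1, 'P-200', 'Engineer'),
        (2, 'P-100', 'Analyst'),
        (3, 'P-100', 'Manager');
""")

# Cross-domain join on OBJECT_SEQ: runs without error, matches the wrong rows.
wrong = con.execute("""
    SELECT p.NAME, e.ROLE
    FROM person_core p
    JOIN employee_core e ON e.OBJECT_SEQ = p.OBJECT_SEQ
    ORDER BY p.NAME, e.ROLE
""").fetchall()

# Cross-domain join on the canonical key: each person gets their own roles.
right = con.execute("""
    SELECT p.NAME, e.ROLE
    FROM person_core p
    JOIN employee_core e ON e.PERSON_UID = p.PERSON_UID
    ORDER BY p.NAME, e.ROLE
""").fetchall()

print("OBJECT_SEQ join:", wrong)
print("PERSON_UID join:", right)
```

In this sample data the OBJECT_SEQ join returns plausible-looking rows (Alice as Engineer, Bob as Analyst) that are simply wrong, and it silently drops Alice's second role. The PERSON_UID join returns three rows: two for Alice, one for Bob.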

4. Expect One-to-Many Relationships

Most relationships in CDM are not 1:1.

Examples:

  • One Person → Many Employees
  • One Project → Many Activities
  • One Customer → Many Orders

If your model assumes 1:1, it is likely wrong.
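
One way to test the 1:1 assumption before building on it is to count rows per key. The sketch below uses Python's sqlite3 module; the employee_core table and its rows are hypothetical sample data, not a real CDM view.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical Employee view: one person can hold several employments.
    CREATE TABLE employee_core (OBJECT_SEQ INTEGER, PERSON_UID TEXT, ROLE TEXT);
    INSERT INTO employee_core VALUES
        (1, 'P-100', 'Analyst'),
        (2, 'P-100', 'Manager'),
        (3, 'P-200', 'Engineer');
""")

# Keys that appear more than once. Any row returned here means the
# relationship is 1-to-many for that key, not 1:1.
multi = con.execute("""
    SELECT PERSON_UID, COUNT(*) AS n
    FROM employee_core
    GROUP BY PERSON_UID
    HAVING COUNT(*) > 1
    ORDER BY PERSON_UID
""").fetchall()

print(multi)
```

An empty result suggests the key is unique in the data you checked; it does not prove the relationship will stay 1:1 as new data arrives.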


5. Understand the View Types

Each view type has a role:

Type          Purpose
Core          One row per object
Meta_Codes    Attributes (categorical values)
Meta_Dates    Dates and periods
Meta_Values   Numeric values
Item_*        Lower-grain transactional detail

Do not mix these without understanding the grain of each.


6. Align Grain Before Joining

Before joining two datasets, ask:

  • What is the row-level grain?
  • Is one side more detailed than the other?

If yes:

  • aggregate first
  • or expect duplication
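
The effect of joining at mismatched grain can be sketched as follows, again with Python's sqlite3 module. The tables project_budget and project_activity and the PROJECT_KEY column are hypothetical, chosen only to show one row per project on one side and many rows per project on the other.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical tables: project grain vs. activity grain.
    CREATE TABLE project_budget   (PROJECT_KEY TEXT, BUDGET INTEGER);
    CREATE TABLE project_activity (PROJECT_KEY TEXT, HOURS INTEGER);

    INSERT INTO project_budget VALUES ('PRJ-1', 1000), ('PRJ-2', 500);
    INSERT INTO project_activity VALUES
        ('PRJ-1', 10), ('PRJ-1', 20), ('PRJ-1', 30), ('PRJ-2', 5);
""")

# Joining first repeats BUDGET once per activity row, so the total inflates.
inflated = con.execute("""
    SELECT SUM(b.BUDGET)
    FROM project_budget b
    JOIN project_activity a ON a.PROJECT_KEY = b.PROJECT_KEY
""").fetchone()[0]

# Aggregating the detailed side to project grain first keeps one row
# per project on both sides of the join.
aligned = con.execute("""
    SELECT SUM(b.BUDGET), SUM(a.TOTAL_HOURS)
    FROM project_budget b
    JOIN (
        SELECT PROJECT_KEY, SUM(HOURS) AS TOTAL_HOURS
        FROM project_activity
        GROUP BY PROJECT_KEY
    ) a ON a.PROJECT_KEY = b.PROJECT_KEY
""").fetchone()

print("budget total after row-level join:", inflated)
print("budget and hours after aggregating first:", aligned)
```

With this sample data the true budget total is 1500. The row-level join reports 3500 because the PRJ-1 budget is counted three times; aggregating first gives 1500 and 65 hours.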

7. Time Matters (Even When It’s Not Obvious)

Not all attributes are currently time-bound.

This means some values represent the latest known state, not the full history.

Do not assume historical accuracy unless it is explicitly modelled.


8. Duplicates Are a Signal, Not Always an Error

Duplicates usually mean one of:

  • ✔ valid multi-record scenario (expected)
  • ⚠ identity not resolved
  • ❌ incorrect join

Do not “fix” duplicates by:

  • removing rows
  • forcing DISTINCT
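
A small illustration of why forcing DISTINCT is risky: it can remove valid rows that merely look alike in the selected columns. The sketch uses Python's sqlite3 module; project_activity, ACTIVITY_KEY and PROJECT_KEY are hypothetical names with made-up rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical detail table: A-1 and A-2 are two genuine activities
    -- that happen to have the same hours.
    CREATE TABLE project_activity (
        ACTIVITY_KEY TEXT, PROJECT_KEY TEXT, HOURS INTEGER
    );
    INSERT INTO project_activity VALUES
        ('A-1', 'PRJ-1', 8),
        ('A-2', 'PRJ-1', 8),
        ('A-3', 'PRJ-1', 4);
""")

true_total = con.execute(
    "SELECT SUM(HOURS) FROM project_activity"
).fetchone()[0]

# DISTINCT on the selected columns collapses A-1 and A-2 into one row,
# so a valid record disappears from the total.
distinct_total = con.execute("""
    SELECT SUM(HOURS)
    FROM (SELECT DISTINCT PROJECT_KEY, HOURS FROM project_activity)
""").fetchone()[0]

print("true total:", true_total)
print("total after DISTINCT:", distinct_total)
```

Here the true total is 20 hours, but the DISTINCT version reports 12. The query still runs and the number still looks reasonable, which is what makes this kind of "fix" hard to spot later.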

👉 Fix the model, not the symptom.


9. Identity Is Critical

PERSON_UID is the canonical identity key

However:

  • identity depends on source data quality
  • some edge cases may exist

If identity looks wrong:

  • raise it
  • do not work around it locally

10. Use the Model as Designed

The CDM is built with:

  • separation of concerns
  • scalable structure
  • future extensibility

Shortcuts will:

  • break consistency
  • create conflicting reports
  • increase rework later

11. When in Doubt, Ask Early

If something doesn’t look right:

  • unexpected duplicates
  • missing joins
  • unclear attributes

👉 Ask before building on top of it.


Summary

  • Domains are separate by design
  • OBJECT_SEQ is not a universal key
  • Use PERSON_UID for people joins
  • Expect 1-to-many relationships
  • Always check grain before joining

Final Principle

Join on meaning, not convenience.

Correct joins produce correct insight.
Convenient joins produce misleading results.


Need Support?

Contact the Data Engineering team for:

  • modelling guidance
  • join validation
  • data clarification

Getting it right once is faster than fixing it later.


