2026-06-03

RTCP CNAME and RTSP Audio/Video Sync Debugging: RTP Timestamps, Sender Reports, Lip Sync, and Track Drift

How to debug RTCP CNAME, RTP timestamp mapping, sender reports, audio/video synchronization, lip sync drift, and multi-track RTSP camera timing problems.

rtcp cname, audio video sync, rtsp lip sync, rtp timestamp, sender report, track drift, rtsp diagnostics

RTSP streams with both audio and video can connect successfully and still be unusable if audio and video drift apart. Users search for "RTSP audio video out of sync", "RTCP CNAME lip sync", "RTP timestamp drift", "camera audio delay", "RTCP sender report synchronization", and "RTSP video ahead of audio" when playback starts but timing feels wrong.

RTSP Inspector is useful because this is a protocol timing problem. The important evidence is not only the decoded media. It is the relationship between RTP timestamps, RTCP Sender Reports, SSRC values, CNAME values, track clocks, and arrival timing.

Why RTP timestamps alone are not enough

RTP timestamps count ticks of each track's own media clock, and each track typically starts from its own random initial value. A video track may use a 90 kHz clock. An audio track may use 8 kHz, 16 kHz, 44.1 kHz, or 48 kHz depending on codec and SDP.

That means a pair of raw values like this is not enough:

video RTP timestamp: 900000
audio RTP timestamp: 480000

Those values are not directly comparable unless each track is mapped to a shared reference clock.
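As a minimal sketch of why the raw numbers mislead, the snippet below converts each value to elapsed media time. The helper name elapsed_seconds is hypothetical, and the example assumes a 90 kHz video clock, a 48 kHz audio clock, and a starting timestamp of 0 on both tracks, none of which the raw values themselves reveal.

```python
RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit and wrap around


def elapsed_seconds(first_ts: int, current_ts: int, clock_rate: int) -> float:
    """Media time elapsed between two RTP timestamps on the same track."""
    # Modular subtraction tolerates a single wraparound of the 32-bit counter.
    ticks = (current_ts - first_ts) % RTP_TS_MODULUS
    return ticks / clock_rate


# Assumed values: both tracks start at 0, video at 90 kHz, audio at 48 kHz.
video_elapsed = elapsed_seconds(0, 900000, 90000)
audio_elapsed = elapsed_seconds(0, 480000, 48000)
print(video_elapsed, audio_elapsed)  # 10.0 10.0
```

Under those assumptions both values mean ten seconds of media. Elapsed time per track still says nothing about how the two tracks line up against each other, which is why a shared reference clock is needed.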

What RTCP Sender Reports provide

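Each RTCP Sender Report pairs an NTP wall-clock timestamp with the RTP timestamp that corresponds to the same instant. That pair lets a receiver map a track's RTP timeline to wall-clock time. For sync analysis, this is critical evidence.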

A useful report should show:

  • SSRC for the video stream.
  • SSRC for the audio stream.
  • RTP timestamp in each Sender Report.
  • NTP timestamp in each Sender Report.
  • Whether reports continue during the session.
  • Whether sender reports jump or reset.
  • Whether CNAME values associate related streams.

When RTCP Sender Reports are missing, delayed, or inconsistent, the receiver has no authoritative mapping and lip sync falls back on heuristics such as packet arrival time.
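The mapping a Sender Report provides can be sketched as follows. This is an illustrative parser for the fixed header and sender info of a single SR packet (packet type 200), not a full RTCP implementation: it ignores report blocks and compound packets, and the names SenderReport, parse_sender_report, and rtp_to_wallclock are hypothetical.

```python
import struct
from typing import NamedTuple


class SenderReport(NamedTuple):
    ssrc: int
    ntp_seconds: float   # NTP timestamp as seconds since 1900-01-01
    rtp_timestamp: int   # RTP timestamp for the same instant


def parse_sender_report(packet: bytes) -> SenderReport:
    """Parse the header and sender info of one RTCP Sender Report."""
    if len(packet) < 28:
        raise ValueError("too short for an RTCP Sender Report")
    first, packet_type, _length, ssrc, ntp_msw, ntp_lsw, rtp_ts = struct.unpack(
        "!BBHIIII", packet[:20]
    )
    if first >> 6 != 2 or packet_type != 200:
        raise ValueError("not an RTCP version 2 Sender Report")
    # The NTP timestamp is 32 bits of seconds plus 32 bits of fraction.
    return SenderReport(ssrc, ntp_msw + ntp_lsw / 2**32, rtp_ts)


def rtp_to_wallclock(sr: SenderReport, rtp_ts: int, clock_rate: int) -> float:
    """Map an RTP timestamp to NTP seconds using one Sender Report."""
    # Signed 32-bit difference so timestamps just before the SR also work.
    diff = (rtp_ts - sr.rtp_timestamp) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr.ntp_seconds + diff / clock_rate
```

Applying rtp_to_wallclock to an audio packet and a video packet, each with its own track's latest Sender Report and clock rate, puts both on the same wall-clock axis so their offset can be measured.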

What CNAME is for

RTCP CNAME is used to associate RTP streams that belong to the same synchronization context. For example, a camera's audio RTP stream and video RTP stream should normally be linkable as related media from the same endpoint.

Failures include:

  • Audio and video use different CNAME values unexpectedly.
  • CNAME changes mid-session.
  • One track sends RTCP SDES and the other does not.
  • Gateway rewrites SSRC but not CNAME consistently.
  • RTCP is blocked by firewall or NAT.
  • Camera sends RTP but no RTCP at all.

If the receiver cannot associate the tracks, audio/video sync can be unstable.
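The failure cases above can be checked mechanically once SDES items have been extracted from a capture. The sketch below assumes the input is a list of (ssrc, cname) pairs in observation order plus the set of SSRCs seen in RTP; the function name cname_findings and the wording of the findings are illustrative only.

```python
from collections import defaultdict


def cname_findings(observed, expected_ssrcs):
    """Report CNAME association problems.

    observed: iterable of (ssrc, cname) pairs taken from RTCP SDES packets.
    expected_ssrcs: SSRC values seen in the RTP streams of the session.
    """
    cnames_by_ssrc = defaultdict(set)
    for ssrc, cname in observed:
        cnames_by_ssrc[ssrc].add(cname)

    findings = []
    for ssrc in sorted(expected_ssrcs):
        names = cnames_by_ssrc.get(ssrc, set())
        if not names:
            findings.append(f"SSRC {ssrc:#010x}: no SDES CNAME seen")
        elif len(names) > 1:
            findings.append(f"SSRC {ssrc:#010x}: CNAME changed mid-session")

    # Related tracks from one endpoint are expected to share one CNAME.
    all_names = set().union(*cnames_by_ssrc.values())
    if len(all_names) > 1:
        findings.append("tracks do not share a single CNAME")
    return findings
```

An empty result means every expected SSRC announced the same stable CNAME. It does not prove the tracks are in sync, only that the receiver can associate them.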

Lip sync drift

Audio/video sync can be wrong immediately or drift over time.

Immediate offset suggests:

  • Wrong start timestamp mapping.
  • Missing initial sender report.
  • Buffering differences.
  • Camera encoder pipeline delay.
  • Audio track starts before video track.

Gradual drift suggests:

  • Wrong clock rate in SDP.
  • Inaccurate RTP timestamp increment.
  • Camera clock instability.
  • Missing or inconsistent RTCP reports.
  • Gateway resampling or transcoding errors.

These two patterns call for different fixes, so diagnose them separately.
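One way to tell the two patterns apart is to fit a straight line through measured audio/video offsets over time: the intercept approximates the immediate offset and the slope approximates the drift rate. The sketch below assumes samples of (elapsed seconds, audio-minus-video offset in seconds) have already been measured. The thresholds are placeholder values chosen for illustration, not recommendations.

```python
def fit_offset(samples):
    """Least-squares line through (elapsed_seconds, offset_seconds) samples.

    Returns (intercept, slope): initial offset in seconds and drift in
    seconds of offset per second of playback.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one instant")
    slope = sum((t - mean_t) * (o - mean_o) for t, o in samples) / var_t
    return mean_o - slope * mean_t, slope


def classify(samples, offset_threshold=0.040, drift_threshold=0.001):
    """Label a trace; both thresholds are hypothetical defaults."""
    intercept, slope = fit_offset(samples)
    labels = []
    if abs(intercept) > offset_threshold:
        labels.append("immediate offset")
    if abs(slope) > drift_threshold:
        labels.append("gradual drift")
    return labels or ["in sync"]
```

A trace can carry both labels at once, for example a camera with a fixed encoder delay and a wrong clock rate.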

Wrong clock rate in SDP

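An SDP line such as a=rtpmap:97 MPEG4-GENERIC/48000/2 tells the receiver how to interpret audio timestamps. If the camera advertises 48000 but increments timestamps as if the clock were 44100, the receiver's audio timeline advances only 44100/48000 (about 0.919) seconds per real second. That is an error of roughly 81 ms per second, or nearly 5 seconds per minute.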

Video can have similar problems when the clock rate is wrong or timestamp increments do not match frame cadence.
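To compare the advertised clock rate with measured behavior, the rtpmap attribute has to be read first. The sketch below parses a single a=rtpmap line with a regular expression. The function name parse_rtpmap and the returned dictionary keys are hypothetical, and it handles only the common "encoding/clock-rate[/channels]" form.

```python
import re

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding params>]
RTPMAP_RE = re.compile(r"^a=rtpmap:(\d+)\s+([^/\s]+)/(\d+)(?:/(\S+))?$")


def parse_rtpmap(line: str):
    """Return payload type, encoding, clock rate, and channels, or None."""
    match = RTPMAP_RE.match(line.strip())
    if match is None:
        return None
    payload_type, encoding, clock_rate, params = match.groups()
    return {
        "payload_type": int(payload_type),
        "encoding": encoding,
        "clock_rate": int(clock_rate),
        # For audio the optional parameter is normally the channel count.
        "channels": int(params) if params and params.isdigit() else None,
    }


print(parse_rtpmap("a=rtpmap:97 MPEG4-GENERIC/48000/2"))
```

The clock_rate value is what the receiver will divide timestamp deltas by, so it is the number to compare against the rate the camera actually uses.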

Evidence:

  • SDP rtpmap clock rate.
  • RTP timestamp deltas.
  • Packet arrival deltas.
  • Frame duration.
  • Sender report mapping.
  • Drift amount over minutes.

RTSP Inspector can turn "audio gradually out of sync" into a measurable timestamp problem.
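A rough version of that measurement can be sketched as follows, assuming a list of (arrival time in seconds, RTP timestamp) pairs for one track in arrival order. The function names are hypothetical. Because arrival times include network jitter, the estimate only becomes meaningful over a trace of several minutes.

```python
def effective_clock_rate(packets):
    """Estimate the clock rate a sender is actually using.

    packets: list of (arrival_seconds, rtp_timestamp) in arrival order.
    """
    if len(packets) < 2:
        raise ValueError("need at least two packets")
    total_ticks = 0
    for (_, prev_ts), (_, cur_ts) in zip(packets, packets[1:]):
        # Signed 32-bit delta handles wraparound and mild reordering.
        delta = (cur_ts - prev_ts) & 0xFFFFFFFF
        if delta >= 0x80000000:
            delta -= 0x100000000
        total_ticks += delta
    elapsed = packets[-1][0] - packets[0][0]
    if elapsed <= 0:
        raise ValueError("trace must span a positive duration")
    return total_ticks / elapsed


def drift_seconds_per_minute(measured_rate, advertised_rate):
    """Timeline error per minute when the receiver trusts advertised_rate."""
    return 60.0 * (measured_rate / advertised_rate - 1.0)
```

With a measured rate near 44100 against an advertised 48000, drift_seconds_per_minute returns about -4.9, meaning the audio timeline falls behind by nearly five seconds every minute.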

RTCP blocked by transport path

With UDP transport, RTP and RTCP normally use a pair of adjacent ports, with RTP on the lower port and RTCP on the next one up. Firewalls and NAT rules sometimes allow RTP but block RTCP. The video may still appear, but quality reports and synchronization evidence disappear.

Symptoms:

  • RTP packets arrive.
  • RTCP Sender Reports never arrive.
  • Jitter/loss stats are unavailable.
  • Audio/video sync depends on client heuristics.
  • TCP interleaved transport behaves differently.

If TCP interleaved fixes sync, the issue may be RTCP reachability rather than codec decode.
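To see which ports or channels RTCP is expected on, the negotiated Transport header from the SETUP exchange can be parsed. The sketch below is a simplified parser that only extracts client_port, server_port, and interleaved ranges. The function name parse_transport is hypothetical and the example header values are made up.

```python
def parse_transport(value: str) -> dict:
    """Extract RTP/RTCP port or channel pairs from an RTSP Transport value."""
    parts = [p.strip() for p in value.split(";") if p.strip()]
    info = {"protocol": parts[0] if parts else ""}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        if key in ("client_port", "server_port", "interleaved"):
            low, _, high = val.partition("-")
            # First number carries RTP, second (if present) carries RTCP.
            info[key] = (int(low), int(high) if high else None)
    return info


udp = parse_transport("RTP/AVP;unicast;client_port=4588-4589;server_port=6256-6257")
tcp = parse_transport("RTP/AVP/TCP;unicast;interleaved=0-1")
print(udp["client_port"][1], tcp["interleaved"][1])  # 4589 1
```

For the UDP case, the second number in client_port is the local port that must be reachable for Sender Reports to arrive. With interleaved transport, RTCP shares the RTSP TCP connection, which is why it often survives firewalls that drop the UDP RTCP port.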

Gateway and restreamer issues

RTSP proxies, NVRs, and cloud bridges may rewrite SSRC, regenerate timestamps, or relay RTCP incorrectly.

Problems include:

  • Video SSRC changes but CNAME stays stale.
  • Audio and video come from different upstream sessions.
  • Sender Reports map to proxy time for one track and camera time for another.
  • Audio is transcoded while video is passed through.
  • RTCP SDES is dropped.

Packet captures taken on both sides of the gateway are the most reliable way to identify where the timing breaks.
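The mixed-clock case, where one track's Sender Reports follow proxy time and the other's follow camera time, can be spotted by comparing how each track's reported wall-clock relates to the local arrival clock. The sketch below assumes each sample is (local arrival seconds, Sender Report wall-clock seconds). The function names and the 0.5 second tolerance are hypothetical.

```python
from statistics import median


def reference_clock_offset(sr_samples):
    """Median difference between SR wall-clock time and local arrival time."""
    return median(sr_time - arrival for arrival, sr_time in sr_samples)


def tracks_share_reference_clock(video_srs, audio_srs, tolerance=0.5):
    """True when both tracks' Sender Reports appear to follow one clock.

    tolerance is an assumed bound in seconds, not a standard value.
    """
    video_offset = reference_clock_offset(video_srs)
    audio_offset = reference_clock_offset(audio_srs)
    return abs(video_offset - audio_offset) <= tolerance
```

A False result does not identify which track is wrong. It only indicates that the two Sender Report streams disagree about the current time by more than the tolerance, which points at the gateway rather than the decoder.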

Debug checklist

Use this workflow:

  1. Capture SDP for all audio and video tracks.
  2. Record each track's payload type and clock rate.
  3. Identify RTP SSRC values.
  4. Capture RTCP Sender Reports.
  5. Capture RTCP SDES CNAME values.
  6. Compare audio and video synchronization context.
  7. Measure RTP timestamp deltas.
  8. Check whether drift is immediate or gradual.
  9. Compare UDP and TCP interleaved behavior.
  10. Preserve a long enough trace to measure drift.

Final diagnosis

RTSP audio/video sync problems require RTP and RTCP timing evidence. The key facts are clock rates, RTP timestamp deltas, RTCP Sender Reports, CNAME association, SSRC continuity, and whether RTCP reaches the receiver.

RTSP Inspector helps diagnose lip sync and track drift as protocol timing evidence rather than vague playback complaints.