Encoding, Decoding, and Frames: Key Concepts

Summary of an Efficient Video Pipeline

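Stage | Description | What it needs
🎥 Capture (input) | Pulls live video data from the camera | A capture device with fast access (e.g. /dev/video0, v4l2, avfoundation)
🎞 Encoding | Compresses the raw video into a form suited to transmission/storage | An efficient codec (e.g. libx264 in software, or hardware-accelerated encoders such as nvenc and vaapi)
🧠 Decoding | Restores the compressed video data back into frames | A hardware-accelerated decoder improves performance
🔄 Transcoding / streaming | Repackages the video into a streaming format (HLS, etc.) | Appropriate keyframe interval, GOP settings, buffer tuning, etc.
📤 Delivery (to browser/player) | Serves .ts segments and the .m3u8 playlist to the client | HLS segment generation, real-time playlist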
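
As a rough illustration of how the capture, encoding, and HLS stages in the table can map onto a single ffmpeg invocation, here is a minimal sketch that only builds the argument list. The function name, the default device, the output file name, and the playlist length of 5 are assumptions chosen for illustration, not settings from the original post.

```python
def build_hls_command(device="/dev/video0", fps=30, keyframe_seconds=2,
                      playlist="stream.m3u8"):
    """Return an ffmpeg argument list: capture -> H.264 encode -> HLS output.

    All default values are illustrative assumptions.
    """
    gop = fps * keyframe_seconds  # frames between keyframes
    return [
        "ffmpeg",
        "-f", "v4l2",                # capture: Linux V4L2 input
        "-framerate", str(fps),
        "-i", device,
        "-c:v", "libx264",           # encode: software H.264 encoder
        "-g", str(gop),              # maximum keyframe interval, in frames
        "-keyint_min", str(gop),     # minimum keyframe interval, in frames
        "-f", "hls",                 # deliver: HLS muxer writes .ts + .m3u8
        "-hls_time", str(keyframe_seconds),  # target segment length, seconds
        "-hls_list_size", "5",       # number of segments kept in the playlist
        playlist,
    ]
```

The list can be printed for inspection or passed to `subprocess.run` on a machine that has ffmpeg and a V4L2 camera available.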

 

 

What Are Video Frames?

When you watch a video, you are really seeing a sequence of images (frames) shown in rapid succession. Encoding and decoding are about compressing and reconstructing those frames.

 

 

But in compressed video, not all frames are full images. Most frames store only what changed relative to other frames, which saves space.

 

📦 Encoding vs 🔓 Decoding

Step | Description
Encoding | Analyzes each frame and decides whether to 🟢 store the full frame (I-frame) or 🟡 store only the difference (P/B-frame)
Decoding | Uses the stored info (I/P/B) to rebuild the full video stream

 

 

Frame Types 

Type | Name | What it Does
🟢 I-Frame | Keyframe | Full image frame. Needed to begin decoding.
🟡 P-Frame | Predicted | Stores only the changes from a previous frame.
🔵 B-Frame | Bi-directional | References frames both before and after it. Often disabled in low-latency systems, because the decoder must wait for the later frame.


ffprobe

To check how many frames appear between keyframes ("pict_type": "I"), use ffprobe:

ffprobe -show_frames -select_streams v -i rtsp://your-stream -print_format json | grep pict_type
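
Counting the frames between I-frames by eye gets tedious, so a small script can help. The sketch below assumes the ffprobe JSON was saved in full (without the `grep`) from a file or a finite capture, so that it is a complete JSON document with a top-level "frames" array. The function name is hypothetical.

```python
import json

def keyframe_intervals(ffprobe_json):
    """Return the number of frames from each I-frame to the next one.

    ffprobe_json: the full text printed by
    `ffprobe -show_frames -select_streams v -print_format json ...`
    """
    frames = json.loads(ffprobe_json).get("frames", [])
    # Positions of every frame whose pict_type is "I"
    i_positions = [n for n, frame in enumerate(frames)
                   if frame.get("pict_type") == "I"]
    # Distance between consecutive I-frames
    return [b - a for a, b in zip(i_positions, i_positions[1:])]
```

If every value in the returned list is the same (for example 30), the stream has a fixed GOP of that many frames.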

GOP (Group of Pictures)

  A GOP is a set of frames that starts with an I-frame and includes the P-frames and/or B-frames that follow.

 

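GOP = keyframe + the P/B-frames that follow, up to the next keyframe.

Setting -g 30 -keyint_min 30 (the maximum and minimum keyframe interval, in frames) asks the encoder for a keyframe every 30 frames.

So at 30 FPS, that is one keyframe per second. Good ✅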
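
The relationship between the GOP size, the frame rate, and the keyframe interval is simple division. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def keyframe_interval_seconds(gop_frames, fps):
    """Seconds between keyframes for a fixed GOP size."""
    return gop_frames / fps

def gop_frames_for(interval_seconds, fps):
    """GOP size (the value for -g) that gives the wanted keyframe interval."""
    return round(interval_seconds * fps)
```

For example, `keyframe_interval_seconds(30, 30)` gives 1.0, matching the case above, and a 2-second interval at 30 FPS needs `gop_frames_for(2, 30)`, which is 60 frames.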

 

 

Applying This to HLS/FFmpeg

 

Why I-frames Matter for HLS

  • Each .ts segment in HLS must start with an I-frame.
  • If not, the player can’t decode that segment from scratch.

So:

  • Long GOP = fewer I-frames → fewer points where a segment can be cut, but lower bitrate
  • Short GOP = more I-frames → easier segmenting, but higher bitrate
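
Because a segment can only be cut at a keyframe, the real segment length depends on the GOP as well as on the requested length. The sketch below assumes a constant GOP, no extra scene-cut keyframes, and a muxer that cuts at the first keyframe at or after the target time (which is how ffmpeg's hls_time option is documented to behave). The function name is hypothetical.

```python
from fractions import Fraction
from math import ceil

def effective_segment_seconds(target_seconds, gop_frames, fps):
    """Segment length actually produced when cuts can only land on keyframes."""
    gop_seconds = Fraction(gop_frames) / Fraction(fps)
    # Whole GOPs needed before the target length is reached
    gops_needed = ceil(Fraction(target_seconds) / gop_seconds)
    return float(gops_needed * gop_seconds)
```

With a 30-frame GOP at 30 FPS, a 2-second target yields 2-second segments. With a 90-frame GOP, the same target yields 3-second segments, since the first usable keyframe arrives only at 3 seconds.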

Keyframe Frequency and Seeking

If a player tries to seek (or switch bitrate) to a point where there is no keyframe, the decoder has no complete picture to start from. Depending on the player, the result can be:

  • ❌ Corrupted image (artifacts)
  • ❌ Decoder crash
  • ❌ Black screen
  • ❌ The request is simply ignored

 

 

  • More keyframes = ✅ Fast, accurate seeking
  • But also = ⛔ Slightly larger file size

 

 

Keyframe Trade-off: File Size vs. Accurate Seeking

  • A keyframe every 2 s is a common balance for HLS/live streaming
  • A keyframe every 1 s is better for precise seeking (VOD)
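
To make the seeking side of the trade-off concrete, the sketch below estimates where decoding has to start for a seek, and how many frames must be decoded and thrown away to reach the exact target. It assumes keyframes sit at exact multiples of the interval starting from t = 0. The function name is hypothetical.

```python
import math

def seek_cost(target_seconds, keyframe_interval_seconds, fps):
    """Return (decode_start_seconds, extra_frames) for a frame-accurate seek."""
    # Last keyframe at or before the target time
    start = (math.floor(target_seconds / keyframe_interval_seconds)
             * keyframe_interval_seconds)
    # Frames decoded only to reach the target, then discarded
    extra_frames = round((target_seconds - start) * fps)
    return start, extra_frames
```

Seeking to 7.5 s at 30 FPS with a 2 s interval starts decoding at 6 s and discards 45 frames. With a 1 s interval it starts at 7 s and discards 15.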

Decoding Process

  • Always starts from an I-frame
  • Then decodes the following P and B frames using that base
  • ❌ Without an I-frame, the player cannot start playback or jump forward
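
The rule that decoding starts from an I-frame can be shown with a list of frame types. The sketch below finds the first frame a player can start from when it joins a stream partway through. It assumes the list is in decode order, and the function name is hypothetical.

```python
def first_decodable_index(frame_types, join_index):
    """Index of the first I-frame at or after join_index, or None if none exists."""
    for i in range(join_index, len(frame_types)):
        if frame_types[i] == "I":
            return i
    return None
```

For the sequence I P P P I P P P, a player joining at index 1 has to wait until index 4 before it can show a picture, and one joining at index 5 finds no I-frame in the data it has.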

 


When you play or process video (for example from a webcam), it has to be decoded: the compressed bitstream (e.g. H.264) is turned back into actual image frames.

There are two main ways to decode video:

Type | What It Uses | Performance
Hardware decoding | GPU or dedicated video chip | Fast, smooth playback
Software decoding | CPU only | Slower, may lag or drop frames