Kling Avatar 2.0

Animates a person from a single photo and a voiceover track, adding lip sync, natural expressions, and gestures. Useful when you need a speaking or singing presenter from one portrait.

Cost

from 6 tokens/s

Video · Provider: Kling

Run generation

curl -X POST https://api.givon.ai/api/v1/generations \
  -H "Authorization: Bearer $GIVON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"video","model":"kling-digital-human","input":{"prompt":"cinematic drone shot over a city at night","duration":2,"speakerImage":"asset://asset_...","speechAudio":"asset://asset_..."}}'
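
The same request can be sketched in Python using only the standard library. The endpoint, headers, model ID, and field names are copied from the curl call above; the helper names (`build_payload`, `build_request`, `send`) are illustrative, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

API_URL = "https://api.givon.ai/api/v1/generations"


def build_payload(prompt, speaker_image, speech_audio, duration=2):
    """Assemble the JSON body with the same shape as the curl example."""
    return {
        "type": "video",
        "model": "kling-digital-human",
        "input": {
            "prompt": prompt,
            "duration": duration,
            "speakerImage": speaker_image,
            "speechAudio": speech_audio,
        },
    }


def build_request(payload, api_key):
    """Wrap the payload in a POST request with the headers the curl call sends."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request; assumes (unverified) that the response body is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request is kept separate from sending it, so the payload can be inspected or logged before any tokens are spent. To run it, pass the result of `build_request` to `send`, with the key read from the `GIVON_API_KEY` environment variable.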

Input fields

* required
prompt* (prompt)

Scene / motion description.

Type: string
Default: (none)
Allowed: string

duration (duration)

Type: number
Default: 2
Allowed: from 2 · up to 300 · step 1 · seconds

Speaker image* (speakerImage)

Asset input for the speakerImage slot.

Type: string
Default: (none)
Allowed: image · asset, https, data

Speech audio* (speechAudio)

Asset input for the speechAudio slot.

Type: string
Default: (none)
Allowed: voiceover · asset, https, data
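
The field constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: `validate_input` is a hypothetical helper, "asset, https, data" is read here as the set of accepted URI schemes (the curl example uses `asset://`, the other two are an interpretation), and treating an empty string as missing is also an assumption.

```python
from urllib.parse import urlparse

# Assumption: "asset, https, data" names the accepted URI schemes.
ALLOWED_SCHEMES = {"asset", "https", "data"}
MIN_DURATION, MAX_DURATION = 2, 300


def validate_input(fields):
    """Return a list of problems in an input dict; an empty list means it passed."""
    problems = []

    # prompt, speakerImage and speechAudio are marked required (*).
    for name in ("prompt", "speakerImage", "speechAudio"):
        value = fields.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required and must be a non-empty string")

    # duration is optional, defaults to 2, and moves in whole-second steps.
    duration = fields.get("duration", 2)
    if isinstance(duration, bool) or not isinstance(duration, (int, float)):
        problems.append("duration must be a number")
    elif (not float(duration).is_integer()
          or not MIN_DURATION <= duration <= MAX_DURATION):
        problems.append("duration must be a whole number of seconds from 2 to 300")

    # Asset slots must point at an asset://, https:// or data: reference.
    for name in ("speakerImage", "speechAudio"):
        value = fields.get(name)
        if isinstance(value, str) and value:
            scheme = urlparse(value).scheme
            if scheme not in ALLOWED_SCHEMES:
                problems.append(f"{name} must use one of: asset, https, data")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.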

Cost

from 6 tokens/s
720p (default): 6 tokens/s
1080p: 12 tokens/s

The variant is selected automatically from request fields, so you do not need to send it.
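
As a rough worked example of the rates above, the sketch below multiplies the per-second rate by the requested duration. It assumes that billing is simply rate × `duration`, which the "tokens/s" unit suggests but this page does not state; `estimate_tokens` is a hypothetical helper, and the variant is passed explicitly here only for estimation, since the API selects it automatically.

```python
# Rates listed on this page, in tokens per second of video.
RATES_TOKENS_PER_SECOND = {"720p": 6, "1080p": 12}


def estimate_tokens(duration_seconds, variant="720p"):
    """Estimate cost as rate x duration (assumed billing model, not confirmed)."""
    if variant not in RATES_TOKENS_PER_SECOND:
        raise ValueError(f"unknown variant: {variant}")
    return RATES_TOKENS_PER_SECOND[variant] * duration_seconds
```

Under this assumption, a 10-second clip would come to 60 tokens at 720p and 120 tokens at 1080p.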

Capabilities

Modes: image_to_video
Asset slots: speakerImage: image* · speechAudio: voiceover*

Run Kling Avatar 2.0

Get an API key and the same request shape will work across every model in the catalog.
