Kling Avatar 2.0

Animates a person from a photo and synchronizes speech, natural expressions, and gestures with a voiceover. Useful when you need a speaking or singing presenter from one portrait.

Cost

from 6 tokens/s

Video · Provider: Kling

Run generation

IDEMPOTENCY_KEY="${IDEMPOTENCY_KEY:-$(uuidgen)}"
curl -X POST https://api.givon.ai/api/v1/generations \
  -H "Authorization: Bearer $GIVON_API_KEY" \
  -H "Idempotency-Key: $IDEMPOTENCY_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"video","model":"kling-digital-human","input":{"prompt":"warm, confident delivery at a steady pace","resolution":"720p","speakerImage":"asset://asset_...","speechAudio":"asset://asset_..."}}'
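
For callers who prefer Python to curl, the same request can be sketched with the standard library. The URL, headers, model id, and body layout are copied from the curl example; the function name `build_generation_request` and its parameters are hypothetical, and the sketch only builds the request without sending it, since the response format is not described on this page.

```python
import json
import uuid
import urllib.request

API_URL = "https://api.givon.ai/api/v1/generations"  # endpoint from the curl example


def build_generation_request(api_key, speaker_image, speech_audio,
                             prompt=None, resolution="720p",
                             idempotency_key=None):
    """Build (but do not send) the POST request for kling-digital-human."""
    input_fields = {
        "resolution": resolution,
        "speakerImage": speaker_image,
        "speechAudio": speech_audio,
    }
    if prompt is not None:  # prompt is not marked as required in the field table
        input_fields["prompt"] = prompt

    body = {"type": "video", "model": "kling-digital-human", "input": input_fields}
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Mirror the shell example: reuse a supplied key, else generate a fresh UUID.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Reusing one `idempotency_key` across retries of the same job mirrors the `IDEMPOTENCY_KEY` variable in the shell example; how the server treats a repeated key is not stated here, so treat that behavior as an assumption.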

Input fields

* required

Field	Name	Type	Default	Allowed	Description
Prompt	`prompt`	string	—	string	Performance direction such as emotion, pace and delivery.
Resolution	`resolution`	string	720p	720p, 1080p
Speaker image*	`speakerImage`	string	—	image · asset, https, data	Asset input for the speakerImage slot.
Speech audio*	`speechAudio`	string	—	voiceover · asset, https, data	Asset input for the speechAudio slot.
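
The constraints in the table above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not the API's own validation: `validate_input` is a hypothetical helper, and reading "asset, https, data" as the prefixes `asset://`, `https://`, and `data:` is an assumption based on the `asset://asset_...` values in the curl example.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
# Assumed reading of the "asset, https, data" entry in the Allowed column.
ASSET_PREFIXES = ("asset://", "https://", "data:")
REQUIRED_ASSETS = ("speakerImage", "speechAudio")


def validate_input(fields):
    """Return a list of problems found in an `input` object; empty means OK."""
    problems = []

    # Both asset slots are marked required (*) in the field table.
    for name in REQUIRED_ASSETS:
        value = fields.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required")
        elif not value.startswith(ASSET_PREFIXES):
            problems.append(f"{name} must be an asset, https, or data reference")

    # resolution falls back to the documented default when omitted.
    resolution = fields.get("resolution", "720p")
    if not isinstance(resolution, str) or resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")

    # prompt is optional, but must be a string when present.
    prompt = fields.get("prompt")
    if prompt is not None and not isinstance(prompt, str):
        problems.append("prompt must be a string")

    return problems
```

For example, `validate_input({"speakerImage": "asset://asset_...", "speechAudio": "asset://asset_..."})` returns an empty list, while dropping `speechAudio` adds the message `speechAudio is required`.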

Cost

from 6 tokens/s

Resolution	Rate
720p (default)	6 tokens/s
1080p	12 tokens/s

The variant is selected automatically from request fields, so you do not need to send it.
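
Since the rate depends only on `resolution`, a rough budget can be computed ahead of time. The sketch below assumes the rate is charged per second of generated video and that no rounding or minimum charge applies; neither is stated on this page, and the "from" wording suggests the result is best read as a lower bound. `estimate_tokens` is a hypothetical helper name.

```python
# Rates listed under Cost, keyed by the `resolution` request field.
TOKENS_PER_SECOND = {"720p": 6, "1080p": 12}


def estimate_tokens(duration_seconds, resolution="720p"):
    """Estimate token cost as rate x duration (assumed billing model)."""
    if resolution not in TOKENS_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return TOKENS_PER_SECOND[resolution] * duration_seconds
```

Under these assumptions, a 30-second voiceover comes to `estimate_tokens(30)` = 180 tokens at 720p and `estimate_tokens(30, "1080p")` = 360 tokens at 1080p.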

Capabilities

Modes: image_to_video

Asset slots: speakerImage: image*, speechAudio: voiceover*

Run Kling Avatar 2.0

Get an API key and the same request shape will work across every model in the catalog.
