I've been testing exactly this workflow. Pick one of the top models (Kling 3.0, Seedance 2.0, or Veo 3.1), generate a start frame from a character sheet you build, then write the scene/script around that frame. Keep clips to 3-4 seconds and edit them together. Results look disturbingly realistic if you take the time on the sheet.
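
For the "edit them together" step, here is a rough sketch of one way to join the short clips without opening an editor. It assumes you have ffmpeg installed and that every clip came out of the same model with the same resolution, codec, and frame rate (otherwise stream copy won't work and you'd need to re-encode). The file names and the helper names `concat_list` and `concat_command` are made up for illustration; none of this is specific to Kling, Seedance, or Veo.

```python
import shlex


def concat_list(clips):
    """Build the text of an ffmpeg concat-demuxer list file for the given clip paths."""
    lines = []
    for clip in clips:
        # The concat demuxer reads one "file '<path>'" directive per line;
        # a single quote inside a path is escaped as '\''.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path, output):
    """Return the ffmpeg argument list that joins the clips named in list_path."""
    # "-c copy" joins without re-encoding, which assumes all clips share
    # the same codec, resolution, and frame rate.
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(output)]


if __name__ == "__main__":
    # Hypothetical clip names, in the order they should play.
    shots = ["shot_01.mp4", "shot_02.mp4", "shot_03.mp4"]
    print(concat_list(shots), end="")
    print(shlex.join(concat_command("clips.txt", "scene.mp4")))
```

Save the list text to `clips.txt`, run the printed command, and you get one `scene.mp4`. Hard cuts only; anything fancier (crossfades, audio) still wants a real editor.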