Oh mate, I feel the pain on this one. Been there, tried every combo under the sun and ended up with a spreadsheet that told me absolutely nothing. 😅 One thing I'd flag - you're spot on about testing combinations vs variables. That's the trap we all fall into.
From what you've shared, your data is basically saying "this ad set won" but not why. When visual length and copy length change together, you can't isolate the driver. Was it the long visual hooking people, or the short copy sealing the deal? No way to tell. So you're stuck optimising each ad set on its own, which is fine if you're just scaling winners, but don't try to compare results across vacancies - the job role itself is a bigger variable than the format.
Here's how I'd set up a cleaner test next time:
✅ Round 1 - lock copy length, test visual only (short vs long)
✅ Round 2 - take the winner, then test copy length
Takes a bit longer, but you actually get actionable insight instead of noise.
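If you want a quick sanity check before launching, here's a rough Python sketch of the "one variable at a time" idea. The factor names (`visual`, `copy`) and the assumption that the long visual wins Round 1 are purely illustrative, so swap in whatever you actually vary.

```python
def differing_factors(variant_a, variant_b):
    """Return the sorted list of factors whose values differ between two variants."""
    all_factors = variant_a.keys() | variant_b.keys()
    return sorted(k for k in all_factors if variant_a.get(k) != variant_b.get(k))


def is_clean_test(variant_a, variant_b):
    """A test is 'clean' when exactly one factor changes between the two variants."""
    return len(differing_factors(variant_a, variant_b)) == 1


# Both factors change at once, so the result can't be attributed to either.
confounded = ({"visual": "long", "copy": "short"},
              {"visual": "short", "copy": "long"})

# Round 1: copy length locked, only the visual changes.
round_1 = ({"visual": "short", "copy": "short"},
           {"visual": "long", "copy": "short"})

# Round 2: assuming (hypothetically) the long visual won, only copy changes.
round_2 = ({"visual": "long", "copy": "short"},
           {"visual": "long", "copy": "long"})

for name, pair in [("confounded", confounded), ("round_1", round_1), ("round_2", round_2)]:
    print(name, differing_factors(*pair), "clean" if is_clean_test(*pair) else "not clean")
```

Anything that comes back "not clean" is a pair you can't learn a "why" from, only a "which".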
And on metrics - high CTR with low conversions? Classic sign the ad is a magnet for the wrong audience: plenty of people click, few of them follow through. For job ads especially, conversions should be the heavyweight champion, not CTR. 📊
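To make that concrete, here's a small Python sketch showing how the "winner" flips depending on which metric you rank by. All the numbers and ad names are made up for illustration, not taken from your data.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    name: str
    impressions: int
    clicks: int
    conversions: int

    @property
    def ctr(self):
        """Click-through rate: clicks per impression."""
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def conversion_rate(self):
        """Conversions per click."""
        return self.conversions / self.clicks if self.clicks else 0.0


# Hypothetical figures, same impressions for both ads.
ads = [
    AdStats("long_visual", impressions=10_000, clicks=400, conversions=8),
    AdStats("short_visual", impressions=10_000, clicks=250, conversions=15),
]

best_by_ctr = max(ads, key=lambda a: a.ctr)
best_by_conversions = max(ads, key=lambda a: a.conversions)

for ad in ads:
    print(f"{ad.name}: CTR {ad.ctr:.1%}, conversion rate {ad.conversion_rate:.1%}")
print("Best by CTR:", best_by_ctr.name)
print("Best by conversions:", best_by_conversions.name)
```

In this made-up case the ad with more clicks brings in fewer conversions, which is exactly the pattern to watch for.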
Hope that helps declutter the next round!