I had this idea. What if I could train an AI model on my face so it could generate YouTube thumbnails for me? No more photoshoots, no more editing. Just type in a prompt and get a banger thumbnail that looks spot on. Or, at the very least, it could be a quick way to come up with some "sketches" before I shoot.
So I tried it. I uploaded about 100 photos of myself to Replicate, trained a LoRA model, waited for it to finish, and started generating images.
The results were... disappointing. I mean it got the vibe right. The lighting, the background, my beanie. But my face looked... off. And the hands? Oof. I guess AI is still not quite there with hands.
To be fair, maybe my thumbnails were a little more complex than just my face. I have a lot of shots with my guitars, close-ups of my hands on the guitar, etc. I uploaded a ton of example photos with the guitar and hands too, but yeah, it just was not good. The guitar was laughable. In one photo it ended up looking like some Chinese instrument.
Then I tried something else. I used Google's Gemini image generation (Nano Banana) and just gave it a couple of my real photos as a reference. No training. No waiting. Just a simple prompt and a photo.
BOOM. Perfection. Light years better. The tool with zero training beat the model I spent hours fine-tuning on 100 photos.
Here's what I learned: fine-tuning an image model is great for brainstorming ideas. Sometimes I literally draw sketches on a pad before I shoot to see what shot I like. I could use this fine-tuned model for "sketches." But for an actual thumbnail that I would use? No way. Just use Nano Banana. It's faster, it's pennies per image, and it's way better.