A lot of this seems to be done manually, like wrapping and centering the text or expanding the image. There must be an easier way.
My dumb ass would've generated HTML, rendered it in a script-controlled browser, taken a screenshot, and written that output to a file. But as I said, there must be an easier way...
Possibly. You don't need to launch a new browser each time, though: you can point an existing tab at a new URL or open a new tab. But yeah, it probably isn't the best way to do it.
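
For what it's worth, the HTML-and-screenshot route could look roughly like the sketch below. It assumes Python with the third-party Playwright package and its Chromium build installed; `build_page`, `render_all`, the 800x400 size, and the `image_N.png` file names are all made up for illustration. CSS handles the wrapping and centering, and one tab is reused for every image instead of launching a browser per image.

```python
import html
from pathlib import Path


def build_page(text: str, width: int = 800, height: int = 400) -> str:
    """Return an HTML document in which CSS wraps and centers the text."""
    safe = html.escape(text)  # keep user text from being parsed as markup
    return f"""<!doctype html>
<html>
<body style="margin:0; width:{width}px; height:{height}px;
             display:flex; align-items:center; justify-content:center;
             text-align:center; font:32px sans-serif;">
<div style="max-width:90%;">{safe}</div>
</body>
</html>"""


def render_all(texts: list[str], out_dir: str,
               width: int = 800, height: int = 400) -> list[Path]:
    """Screenshot one PNG per text, reusing a single browser tab."""
    # Assumption: Playwright is installed (pip install playwright,
    # then: playwright install chromium). Imported here so build_page
    # stays usable without it.
    from playwright.sync_api import sync_playwright

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    with sync_playwright() as p:
        browser = p.chromium.launch()  # launched once, not per image
        page = browser.new_page(viewport={"width": width, "height": height})
        for i, text in enumerate(texts):
            # Swap the content of the same tab instead of opening a new one.
            page.set_content(build_page(text, width, height))
            target = out / f"image_{i}.png"
            page.screenshot(path=str(target))
            paths.append(target)
        browser.close()
    return paths
```

Calling `render_all(["first caption", "second caption"], "out")` would write one PNG per string into `out/`. If the pages live at real URLs instead, `page.goto(url)` on the same tab does the "change the URL" version of this. Whether this beats doing the layout by hand depends on how much you mind shipping a browser as a dependency.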