I'm no expert on VTTs, but I would think that it depends on your VTT as much as anything else.
The usual file-size-reduction suggestions (reduce the image to an indexed-color palette, use a lower quality setting for lossy compression, fill non-critical areas with a solid color so they compress better) may or may not help, depending on your image format and the processing tools you have available. If you can identify moderate-sized blocks that are pure background, each of those can likely be replaced by a single tiny image stretched to fill the area, but then you have to reassemble the full image inside the VTT. A similar idea, if your VTT allows layers, is to use two of them: a repeating background texture underneath, and the foreground image element(s) on top with a transparent background, which may let the foreground compress well.
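To illustrate why solid-color areas help, here is a rough sketch using only Python's standard library. It is not a real image encoder: it just runs raw grayscale pixel bytes through DEFLATE (the compression PNG is built on) and compares sizes before and after flattening half the image to one value. The image size, the random-noise "texture", and the function names are all assumptions made up for this demonstration, and noise is close to a worst case, so real maps will show a smaller difference.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 256  # hypothetical map size in pixels


def noisy_image(seed=0):
    """Return HEIGHT rows of WIDTH grayscale bytes filled with random noise."""
    rng = random.Random(seed)
    return [bytearray(rng.randrange(256) for _ in range(WIDTH))
            for _ in range(HEIGHT)]


def flatten_region(rows, x0, y0, x1, y1, value=128):
    """Overwrite the rectangle [x0, x1) x [y0, y1) with one solid value."""
    for y in range(y0, y1):
        rows[y][x0:x1] = bytes([value]) * (x1 - x0)
    return rows


def compressed_size(rows):
    """Size in bytes of the pixel data after DEFLATE at maximum level."""
    return len(zlib.compress(b"".join(rows), 9))


original = noisy_image()
# Pretend the top half of the map is non-critical and make it solid gray.
flattened = flatten_region(noisy_image(), 0, 0, WIDTH, HEIGHT // 2)

print("original :", compressed_size(original), "bytes")
print("flattened:", compressed_size(flattened), "bytes")
```

The flattened version should come out at roughly half the size, since the solid half compresses to almost nothing. Whether you see a comparable gain on an actual map depends on the format your VTT accepts and on how much of the image you can afford to flatten.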