Because Redshift does not automatically “reclaim” the space taken up by a deleted or updated row, occasionally you’ll need to resort your tables and clear out any unused space. Doing so can optimize performance and reduce the number of nodes you need to host your data (thereby reducing costs). This can be done using the VACUUM command.
The VACUUM command can only be run by a superuser or the owner of the table. Even worse, if you do not have those privileges, Redshift will tell you the command worked perfectly, while in reality, it will have no effect.
It’s really simple to do:
That command alone should handle 90%+ of cases, but there are a couple of caveats:
- In rare cases, the table size can only grow after you perform a VACUUM. This can be because there are no deleted rows that can be reclaimed and the new sort order results in a lower % of the data being able to be compressed.
- Generally speaking, it's a good idea to make sure that there are no WRITE operations going on while the VACUUM is running.
- You can't VACUUM within a BEGIN/END block.
And that's it! Feel free to reach out in the comments if you have any questions. :-)