pg_repack
principle
pg_repack 1.5.0 – Reorganize tables in PostgreSQL databases with minimal locks
https://github.com/reorg/pg_repack
- create a log table to record changes made to the original table
- add a trigger onto the original table, logging INSERTs, UPDATEs and DELETEs into our log table
- create a new table containing all the rows in the old table
- build indexes on this new table
- apply all changes which have accrued in the log table to the new table
- swap the tables, including indexes and toast tables, using the system catalogs
- drop the original table
The basic idea is
- transport the existent data with a old snapshot
- record the incremental data into a table and replay the record
And this idea is so general that almost all online-ddl ability in PG(supported in extensions) takes the way.
details
Although the idea is so simple, there are many problems to challenge. Such as how to ensure there are no duplicated or lost data in both existent part and incremental part. So code-level details are shown below:
All the 7 step are manipulated through 2 connections: See function repack_one_table for detail:
- 
create a log table to record changes made to the original table 
- 
add a trigger onto the original table, logging INSERTs, UPDATEs and DELETEs into our log table - conn1starts a transaction and acquire an advisory lock to prevent potential conflict with other repack process
- get the AccessiveExclusivelock to the original table,tblfor example
- create the trigger on tbland the correspondinglog-tablewhere the incremental changes will be stored.
- (Just comments: If we release the exclusive lock here, we may not able to acquire a shared lock later if another process has gotten a exclusive lock in the interval, which can cause that we have no way to continue or revert what we have done. So we must acquire a lock during the whole process. 👌)
- conn2tries to acquire the- AccessiveSharedlock on- tbl. Since the- conn1’s transaction hasn’t finished, this lock acquisition will be blocked.
- conn1kill all connections that tries to perform a DDL operation, whose character is waiting for- AccessiveLock. Then,- conn1commits.
- Now conn2get theAccessiveSharedlock ontbl, which can ensure that no other dll operation ontbl✌️
 
- 
create a new table containing all the rows in the old table - 
conn1begins a serializable transaction( repeatable read, at least)
- 
conn1get thevxidsof current active transactions
- 
conn1delete all data intblwith the current snapshot (This means we don’t perform a “truncate” operation ). This is a very skillful technique:- 
The table shows the secret: tbl log table visible existent data empty invisible incremental data incremental data 
- 
All existent data is visible in tblthrough the current snapshot
- 
All incremental data is invisible in log tableandtbl(The latter one isn’t important
- 
So there is no lost or duplicated data 
 
- 
- 
conn1copies all data intblto a temp tabletbl-tmpfor example
- 
conn1commits
 
- 
- 
build indexes on this new table. (I don’t care this.) 
- 
apply all changes which have accrued in the log table to the new table - conn1apply at most 1000 records in- log-table, until- the remaining records are few. AND
- All transactions in vxidsfinish. This operation is to keep the ISOLATION, but it still has some accidence. #TODO
 
- (Just comments: Now we believe that there is few records in log-table.)
- conn2acquire the- AccessiveExclusivelock. Note that no other process can do that
- conn2apply all data in- log-table
 
- 
swap the tables, including indexes and toast tables, using the system catalogs - conn2swaps- relfilenodebetween- tbl-tmpand- tbl
- conn2commits
 
- 
drop the original table - conn1drop the current- tbl-tmp
- conn1analyze the current- tbl
- conn1release the advisory lock
 
written by mobilephone724