Storage and database cleanup

Modified on Tue, 4 Jun at 8:38 AM

Historical Launches data

What we store in the database:
  • metadata for each test result
  • metadata for each launch
  • scenario data
  • fixtures data
  • links to the artefacts on S3

The built-in cleanup procedure deletes artefacts and the related records in the database

Built-in cleanup (see the dedicated article for details) is intended to delete the following data according to defined rules:
- fixture artefacts from S3 and their file references from the database
- test result attachments from S3 and their file references from the database
- test result scenarios and all their references from the database

What historical data is not deleted

Launches data

- launch metadata from the database
- metadata of each test result

If you want to delete historical launch data, it is currently not possible by means of Allure TestOps itself. There is a standalone tool that allows deleting launches by AQL: https://github.com/eroshenkoam/allure-testops-utils#delete-launches-in-allure-testops

The tool needs to be used with caution: deleting a launch is a heavy operation against the database, and during high-load periods it can dramatically degrade performance, even to the point of the instance becoming unresponsive.


Deleted data

The data stored in the database contains records with historical data, paths to the artefact files on S3, and so on.

Since there are many insert and delete operations, the tables tend to get bloated. The PostgreSQL engine does not physically remove rows requested for deletion; it only marks them as dead. During such operations, dead rows remain in the tables and the amount of occupied disk space grows. To avoid this, the removal of such rows needs to be performed properly.

The only way to remove the data marked for deletion, and therefore reclaim the wasted space, is the VACUUM procedure. It needs to be executed on a regular basis, because Allure TestOps also deletes data on a regular basis.

Diagnostics:

Execute:

select
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
  from pg_stat_all_tables
 where schemaname = 'public'
 order by n_dead_tup desc;

n_dead_tup - number of dead rows
n_live_tup - number of live (actually used) rows
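
If you want a single bloat indicator per table, the same view can be queried for the share of dead rows. The query below is our own variation of the diagnostics query above, not part of the product documentation:

-- share of dead rows per table in the public schema
select
    relname,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) as dead_pct
  from pg_stat_all_tables
 where schemaname = 'public'
 order by n_dead_tup desc;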


If the response shows a large number of dead rows, then you will need to perform the VACUUM operation:


Option 1. Autovacuum


Execute:

select name, setting
  from pg_settings
 where name in ('autovacuum_analyze_threshold', 'autovacuum_analyze_scale_factor');
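
Additionally, it may be worth checking that autovacuum is enabled at all, and whether any tables override the global settings. These extra checks are our suggestion rather than part of the original procedure:

-- is autovacuum enabled globally
select name, setting
  from pg_settings
 where name = 'autovacuum';

-- per-table autovacuum overrides, if any, are stored in reloptions
select relname, reloptions
  from pg_class
 where reloptions is not null;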

Calculate the number of dead rows at which autovacuum will trigger.

Formula for determining the need for analyze:


autovacuum_analyze_threshold + (autovacuum_analyze_scale_factor * the number of rows in the table)


For example:

In the table test_fixture_result_children there are 13834619 records plus 619398 dead rows.
The last execution of autovacuum was 2024-05-09 08:27:32.604604+00.


The values of the parameters are

autovacuum_analyze_scale_factor = 0.05, autovacuum_analyze_threshold = 50


Calculations:

50 + (0.05 * 13833401) = 691720 (5%)


Since the table already contains 619,398 dead rows, this means that autovacuum will run after approximately 72,322 more rows have been changed (691,720 - 619,398).
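
If needed, the same estimation can be done in SQL for all tables at once. The query below is only an illustration of the formula above and assumes the global settings apply (no per-table overrides):

-- estimated analyze trigger point per table, based on the global autovacuum settings
select
    relname,
    n_live_tup,
    n_dead_tup,
    current_setting('autovacuum_analyze_threshold')::integer
      + current_setting('autovacuum_analyze_scale_factor')::float * n_live_tup as analyze_trigger_rows
  from pg_stat_all_tables
 where schemaname = 'public'
 order by n_dead_tup desc;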


Option 2. Manual vacuum


vacuum <tablename>;
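
For example, a possible invocation for the table from the example above (the VERBOSE and ANALYZE options are optional and shown here only as an illustration):

vacuum (verbose, analyze) test_fixture_result_children;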


The options to be used for VACUUM and the frequency of the procedure need to be decided by the personnel responsible for database maintenance, taking into account the peculiarities of the VACUUM procedure described in the article mentioned above (exclusive locks, etc.).





