TicTac AAE Folds

Since Riak KV 2.9.1, the new AAE system, TicTac AAE, has included several functions that make key listing and tombstone management quicker and more efficient by using TicTac AAE's Merkle trees instead of iterating over the keys in a bucket.

These functions stabilised in Riak KV 2.9.4, so using them on earlier versions is not recommended.

Configuration settings in riak.conf

For more TicTac AAE configuration settings, please see the TicTac AAE configuration settings documentation.

TicTacAAE

Turn on TicTacAAE. It works independently of the legacy AAE system, so it can be run in parallel with the legacy system or without it.

tictacaae_active = active

Note that this will use up more memory and disk space as more metadata is being stored.

Storeheads

Turn on TicTacAAE storeheads. This ensures that TicTacAAE stores more information about each key, including the size, modified date, and tombstone status. Unless this is set to enabled, the aae_fold functions on this page will not work as expected.

tictacaae_storeheads = enabled

Note that this will use up more memory and disk space as more metadata is being stored.

Tuning

You can increase the number of simultaneous workers by changing the af4_worker_pool_size value in riak.conf. The default is 1 per node.

af4_worker_pool_size = 1

General usage

Use riak attach to run these commands.

The general format for calling aae_fold is:

riak_client:aae_fold(
    query,
    Client).

query is a tuple describing the function to run and the parameters to use. The first value in the tuple is always the function name. For example, if calling the list_buckets function the tuple would look like {list_buckets, ...}. The number of values in the tuple depends on the function being called.

As an example, this will call list_buckets, which takes a single parameter:

riak_client:aae_fold({
    list_buckets,
    3
    }, Client).

The Riak Client

For these calls to work, you will need a Riak client. This will create one in a reusable variable called Client:

{ok, Client} = riak:local_client().

Client can now be used for the rest of the riak attach session.

Troubleshooting - timeouts

The calls to aae_fold are synchronous calls with a 1 hour timeout, but they start an asynchronous process in the background.

If your command takes longer than 1 hour, you will get {error,timeout} as a response after 1 hour. Note that the requested command continues to run in the background, so running the same command again will consume additional resources.

A timeout typically only occurs when the bucket holds a very large number of keys.
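
Because a timeout comes back as an ordinary return value, it can be matched explicitly rather than left to fail a pattern match. The following is a minimal sketch, not a definitive pattern. It assumes Client is already bound as described above, and that a successful list_buckets call returns {ok, Buckets}, which is an assumption not stated on this page:

```erlang
%% Sketch only: the {ok, Buckets} success shape is an assumption.
case riak_client:aae_fold({list_buckets, 3}, Client) of
    {ok, Buckets} ->
        io:format("found ~b buckets~n", [length(Buckets)]);
    {error, timeout} ->
        %% The fold is still running in the background at this point,
        %% so avoid submitting the same query again straight away.
        io:format("timed out; fold still running in the background~n");
    Other ->
        io:format("unexpected response: ~p~n", [Other])
end.
```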

How to check if finished after a timeout

After experiencing a timeout, the current number of commands waiting to execute can be checked by asking for the size of the assured forwarding pool af4_pool. Once it reaches 0, there are no more workers as all commands have finished. The size of the pool can be checked using this command:

{_, _, _, [_, _, _, _, [_, _, {data, [{"StateData", {state, _, _, MM, _, _}}]}]]} =
    sys:get_status(af4_pool),
io:format("af4_pool has ~b workers\n", [length(MM)]),
f().
Warning: existing variables cleared

f() will unbind any existing variables, which may not be your intention. If you remove f() then please remember that MM will remain bound to the first value. For re-use, you should change the variable name or restart the riak attach session.
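
One way to avoid both f() and a lingering MM binding is to wrap the check in a fun, since variables bound inside a fun body are local to it. This is a sketch under the same assumption as the command above, namely that sys:get_status(af4_pool) returns a state tuple of that shape; PoolSize is a hypothetical name chosen for illustration:

```erlang
%% PoolSize is a hypothetical helper name. MM is bound only inside the fun,
%% so it does not stay bound in the shell afterwards.
PoolSize = fun() ->
    {_, _, _, [_, _, _, _, [_, _, {data, [{"StateData", {state, _, _, MM, _, _}}]}]]} =
        sys:get_status(af4_pool),
    length(MM)
end.

%% Call as often as needed during the session.
io:format("af4_pool has ~b workers\n", [PoolSize()]).
```

Define PoolSize once per riak attach session and then only repeat the io:format line; PoolSize itself remains bound in the shell.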

How to avoid timeouts

To reduce the chance of getting a timeout, reduce the number of keys checked by using the bucket and key range filters.

The modified filter will not reduce the number of keys checked, and only acts as a filter on the result.
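
As an illustration of narrowing a fold, the sketch below restricts object_stats to one bucket and one key range. The bucket name and key range are hypothetical, and the tuple layout (bucket filter, key range filter, modified filter) is an assumption that should be checked against the object_stats and TicTac AAE Filters documentation:

```erlang
%% Sketch only: tuple layout and values are assumptions for illustration.
riak_client:aae_fold({
    object_stats,
    <<"cars">>,              % bucket filter (hypothetical bucket name)
    {<<"a">>, <<"m">>},      % key range filter: reduces the keys checked
    all                      % modified filter: left open, it would only filter results
    }, Client).
```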

Filters

Please see the TicTac AAE Filters documentation. It describes the filters shared by several functions as well as the filters that can only be used with the find_keys function.

Find keys

Function: find_keys

Returns a list of keys that meet the filter parameters.
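
A call might look like the sketch below. The bucket name is hypothetical, and the tuple layout (bucket filter, key range filter, modified filter, then a find_keys-only filter such as sibling count) is an assumption to be confirmed against the find_keys documentation:

```erlang
%% Sketch only: tuple layout and values are assumptions for illustration.
riak_client:aae_fold({
    find_keys,
    <<"cars">>,              % bucket filter (hypothetical bucket name)
    all,                     % key range filter: all keys
    all,                     % modified filter: any modification date
    {sibling_count, 1}       % find_keys-only filter; threshold is illustrative
    }, Client).
```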

Find Riak tombstones

Function: find_tombs

Returns tuples of bucket name, key name, and object size for the Riak tombstone objects that meet the filter parameters.
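
A call might look like the sketch below. The bucket name is hypothetical, and the tuple layout (bucket filter, key range filter, segment filter, modified filter) is an assumption to be confirmed against the find_tombs documentation:

```erlang
%% Sketch only: tuple layout and values are assumptions for illustration.
riak_client:aae_fold({
    find_tombs,
    <<"cars">>,              % bucket filter (hypothetical bucket name)
    all,                     % key range filter: all keys
    all,                     % segment filter: all segments
    all                      % modified filter: any modification date
    }, Client).
```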

List Buckets

Function: list_buckets

Returns a list of all buckets.

Count keys

Function: erase_keys with count

Counts the Riak keys that meet the filter parameters.
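
The sketch below counts keys modified within a date window. It assumes that the modified filter takes the form {date, From, To} with times in seconds since the Unix epoch, and that erase_keys takes bucket, key range, segment, and modified filters followed by the change method; both assumptions should be checked against the function's own documentation. The bucket name and dates are hypothetical:

```erlang
%% Convert calendar dates to seconds since the Unix epoch.
%% 62167219200 is the Gregorian-seconds value of 1970-01-01 00:00:00.
From = calendar:datetime_to_gregorian_seconds({{2022,1,1},{0,0,0}}) - 62167219200.
To = calendar:datetime_to_gregorian_seconds({{2022,2,1},{0,0,0}}) - 62167219200.

%% Sketch only: with count as the last element, nothing is deleted.
riak_client:aae_fold({
    erase_keys,
    <<"cars">>,              % bucket filter (hypothetical bucket name)
    all,                     % key range filter
    all,                     % segment filter
    {date, From, To},        % modified filter (assumed format)
    count
    }, Client).
```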

Count tombstones

Function: reap_tombs with count

Counts the Riak tombstone objects that meet the filter parameters.

Get object statistics

Function: object_stats

Returns a count of Riak objects that meet the filter parameters.

Erase keys

Function: erase_keys with local

Deletes Riak keys that meet the filter parameters.
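
Since erase_keys accepts both count and local, one cautious workflow is to run the count first and only then delete with the same filters. This is a sketch of that idea, not a prescribed procedure; the bucket name, key range, and tuple layout are assumptions for illustration:

```erlang
%% Hypothetical filters, bound once so both calls use exactly the same values.
EraseBucket = <<"cars">>.
EraseRange = {<<"a">>, <<"b">>}.

%% Step 1: dry run. Counts the matching keys without deleting anything.
riak_client:aae_fold({erase_keys, EraseBucket, EraseRange, all, all, count}, Client).

%% Step 2: only if the count looks right, delete the same set of keys.
riak_client:aae_fold({erase_keys, EraseBucket, EraseRange, all, all, local}, Client).
```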

Reap tombstones

Function: reap_tombs with local

Reaps the Riak tombstone objects that meet the filter parameters.
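
A call might look like the sketch below. The bucket name is hypothetical, and the tuple layout (bucket, key range, segment, and modified filters followed by the change method) is an assumption to be confirmed against the reap_tombs documentation. Replacing local with count gives the number of matching tombstones without reaping them:

```erlang
%% Sketch only: tuple layout and values are assumptions for illustration.
riak_client:aae_fold({
    reap_tombs,
    <<"cars">>,              % bucket filter (hypothetical bucket name)
    all,                     % key range filter: all keys
    all,                     % segment filter: all segments
    all,                     % modified filter: any modification date
    local                    % change method: reap rather than count
    }, Client).
```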

Other functions not covered

aae_fold has various other functions that can be called, but they are mostly for internal use by Riak. These functions should not be used without a good understanding of the source code, and are listed here for reference only:

  • fetch_clocks_nval
  • fetch_clocks_range
  • merge_branch_nval
  • merge_root_nval
  • merge_tree_range
  • repair_keys_range
  • repl_keys_range