Speedup count of records

Requirements

ArcadeDB can be slow to count the records in a bucket or a type because it doesn't keep track of the number of records stored. This means that when you execute a count, every bucket involved is scanned in full.

The main reason we don't save metadata about the number of records is that counting is considered a rare operation: it is mainly useful to a DBA and rarely used at run-time.

There are some use cases where having a fast count could be useful, such as:

Sequences: ArcadeDB doesn't support sequences, but it'd be easy to create a sequence-like script if the count were immediate. Example: insert into X set id = (select count(*) + 1 as newId from X)[0].newId
Studio: it would be nice to have a count column next to each type and a total of the records in the database

Implementation

The easiest and fastest implementation off the top of my head is to save the counts in a new JSON file named database_count.json under the database directory, like this:

{
  "Invoice_238278273": 100000,
  "Invoice_238278999": 30000
}

This file is saved at database shutdown and loaded at startup. Once loaded, the file must be deleted, so that if the database isn't closed properly, the file is missing at the next startup and the actual count must be obtained by scanning.
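The save/load/delete cycle described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `BucketCountFile` and its method names are hypothetical, the JSON is written and parsed by hand to keep the example dependency-free, and bucket names are assumed to contain no characters that need JSON escaping.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.StringJoiner;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the ArcadeDB code base.
final class BucketCountFile {
  static final String FILE_NAME = "database_count.json";

  // Matches one "bucketName": 123 entry of the flat JSON object.
  // Assumption: bucket names never contain a double quote.
  private static final Pattern ENTRY = Pattern.compile("\"([^\"]+)\"\\s*:\\s*(\\d+)");

  // Intended to be called at clean shutdown: writes every cached counter to the file.
  static void save(Path databaseDir, Map<String, Long> counts) throws IOException {
    StringJoiner json = new StringJoiner(",\n", "{\n", "\n}\n");
    for (Map.Entry<String, Long> e : counts.entrySet())
      json.add("  \"" + e.getKey() + "\": " + e.getValue());
    Files.write(databaseDir.resolve(FILE_NAME), json.toString().getBytes(StandardCharsets.UTF_8));
  }

  // Intended to be called at startup: returns the saved counters and deletes the file,
  // so a later crash cannot leave stale counts behind.
  static Map<String, Long> loadAndDelete(Path databaseDir) throws IOException {
    Map<String, Long> counts = new ConcurrentHashMap<>();
    Path file = databaseDir.resolve(FILE_NAME);
    if (!Files.exists(file))
      return counts; // no file (first start or unclean shutdown): counts are rebuilt by scanning

    String json = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    Matcher m = ENTRY.matcher(json);
    while (m.find())
      counts.put(m.group(1), Long.parseLong(m.group(2)));

    Files.delete(file);
    return counts;
  }
}
```

An empty map returned by `loadAndDelete` simply means that every bucket falls back to a scan on its first count.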

In RAM, the EmbeddedDatabase instance will hold a ConcurrentHashMap<String,Long> that maps bucket names (strings) to record counters (longs). This map will be updated only at transaction commit, inside the exclusive lock, to prevent concurrent updates.
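A minimal sketch of such an in-memory map is shown below. The class `BucketCounters`, the `ReentrantLock` standing in for the exclusive commit lock, and the idea of passing per-bucket deltas (records created minus records deleted in the transaction) are all assumptions made for illustration; the proposal itself only states that the map is updated at commit time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the counter map held by the database instance.
final class BucketCounters {
  private final ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();

  // Stands in for the exclusive lock taken at commit (assumption).
  private final ReentrantLock commitLock = new ReentrantLock();

  // Returns the cached count, or null if the bucket has not been counted yet.
  Long get(String bucketName) {
    return counts.get(bucketName);
  }

  // Stores a count obtained from a full scan.
  void set(String bucketName, long total) {
    counts.put(bucketName, total);
  }

  // Applies the per-bucket deltas collected by a committing transaction.
  void applyCommit(Map<String, Long> deltas) {
    commitLock.lock();
    try {
      for (Map.Entry<String, Long> e : deltas.entrySet())
        // Only adjust buckets whose count is already known;
        // unknown buckets stay absent and are counted lazily by a scan.
        counts.computeIfPresent(e.getKey(), (name, current) -> current + e.getValue());
    } finally {
      commitLock.unlock();
    }
  }
}
```

Skipping buckets that have no cached entry keeps the map correct without forcing a scan at commit time: a missing entry always means "unknown", never "zero".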

Pseudo algorithm

public long Bucket.count() {
  // FAST PATH: USE THE CACHED COUNT IF AVAILABLE
  Long cachedCount = database.getBucketCount( name );
  if( cachedCount != null )
    return cachedCount;

  // SLOW PATH: SCAN THE BUCKET AND COUNT THE RECORDS
  long total = countRecords();

  // CACHE THE RESULT FOR THE NEXT CALLS
  database.updateBucketCount( name, total );
  return total;
}
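For readers who want to try the idea in isolation, here is a self-contained toy version of the pseudo algorithm. It is a sketch only: `CachedCountBucket` is hypothetical, records are held in a plain list instead of bucket pages, and the `scans` field exists purely to show that the second call does not scan again.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a bucket; not the ArcadeDB Bucket class.
final class CachedCountBucket {
  private final String name;
  private final List<String> records;           // stands in for the stored records
  private final Map<String, Long> cachedCounts; // per-database cache shared by all buckets
  int scans = 0;                                // number of full scans performed, for illustration

  CachedCountBucket(String name, List<String> records, Map<String, Long> cachedCounts) {
    this.name = name;
    this.records = records;
    this.cachedCounts = cachedCounts;
  }

  long count() {
    // Fast path: return the cached count if one exists.
    Long cached = cachedCounts.get(name);
    if (cached != null)
      return cached;

    // Slow path: scan every record, then remember the result.
    scans++;
    long total = 0;
    for (String ignored : records)
      total++;

    cachedCounts.put(name, total);
    return total;
  }

  public static void main(String[] args) {
    Map<String, Long> cache = new ConcurrentHashMap<>();
    CachedCountBucket bucket = new CachedCountBucket("Invoice_0", List.of("a", "b", "c"), cache);
    System.out.println(bucket.count() + " records, scans so far: " + bucket.scans);
    System.out.println(bucket.count() + " records, scans so far: " + bucket.scans);
  }
}
```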

Speedup count of records #1182
