Speedup count of records #1182

@lvca

Requirements

ArcadeDB can be slow to count the records in a bucket or a type because it doesn't keep track of the number of records stored. This means that when you execute a count, every bucket involved is scanned in full.

The main reason we don't store metadata about the number of records is that counting is considered a rare operation: it is mainly useful to a DBA and rarely needed at run-time.

There are some use cases where having a fast count could be useful, such as:

  • Sequences: ArcadeDB doesn't support sequences, but it would be easy to write a sequence-like script if the count were immediate. Example: insert into X set id = (select count(*) + 1 as newId from X)[0].newId
  • Studio: it would be nice to have a count column next to each type and a total number of records in the database

Implementation

The easiest and fastest implementation off the top of my head is to save the counts in a new JSON file named database_count.json under the database directory, like this:

{
  "Invoice_238278273": 100000,
  "Invoice_238278999": 30000
}

This file is saved at database shutdown and loaded at startup. Once loaded, the file must be deleted so that, if the database is not closed properly, the counts are recomputed by scanning.
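The save/load/delete cycle could look roughly like the sketch below. It is only an illustration: `BucketCountFile`, `save` and `loadAndDelete` are hypothetical names, not existing ArcadeDB classes, and the hand-rolled JSON handling assumes bucket names contain no quotes or backslashes. A real implementation would more likely use the JSON support already available in the codebase.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of ArcadeDB: persists bucket counts across clean restarts.
class BucketCountFile {
  static final String FILE_NAME = "database_count.json";

  // Matches one "bucketName": number pair of the flat JSON object.
  // Assumption: bucket names contain no quotes or backslashes.
  private static final Pattern ENTRY = Pattern.compile("\"([^\"]+)\"\\s*:\\s*(\\d+)");

  // Called at clean shutdown: write the in-memory counts as a flat JSON object.
  static void save(Path databaseDir, Map<String, Long> counts) throws IOException {
    String body = counts.entrySet().stream()
        .map(e -> "  \"" + e.getKey() + "\": " + e.getValue())
        .collect(Collectors.joining(",\n", "{\n", "\n}\n"));
    Files.write(databaseDir.resolve(FILE_NAME), body.getBytes(StandardCharsets.UTF_8));
  }

  // Called at startup: read the counts, then delete the file so that a crash
  // before the next clean shutdown cannot leave stale counts behind.
  static Map<String, Long> loadAndDelete(Path databaseDir) throws IOException {
    Map<String, Long> counts = new ConcurrentHashMap<>();
    Path file = databaseDir.resolve(FILE_NAME);
    if (!Files.exists(file))
      return counts; // first start or unclean shutdown: counts come from scanning

    String json = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    Matcher m = ENTRY.matcher(json);
    while (m.find())
      counts.put(m.group(1), Long.parseLong(m.group(2)));

    Files.delete(file);
    return counts;
  }
}
```

An empty map returned from `loadAndDelete` simply means that every bucket starts without a cached count, which is the same state as after an unclean shutdown.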

In RAM, the EmbeddedDatabase instance will hold a ConcurrentHashMap<String,Long> that maps bucket names (strings) to record counters (longs). This map will be updated only at transaction commit, inside the exclusive lock, to prevent concurrent updates.
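A minimal sketch of that in-memory map follows. The class `BucketCounters`, the method `applyCommit` and the idea of passing per-bucket deltas (records created minus records deleted by the transaction) are assumptions made for illustration. The `ReentrantLock` only stands in for the exclusive lock that the commit already holds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the counter map held by the database instance.
class BucketCounters {
  private final ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();

  // Stand-in for the exclusive lock already taken at commit time.
  private final ReentrantLock commitLock = new ReentrantLock();

  // Returns null when the count is unknown and a scan is required.
  Long getBucketCount(String bucketName) {
    return counts.get(bucketName);
  }

  // Stores a count obtained from a full scan or from the startup file.
  void updateBucketCount(String bucketName, long total) {
    counts.put(bucketName, total);
  }

  // deltas: per bucket, records created minus records deleted by the
  // committing transaction (an assumed way of tracking changes).
  void applyCommit(Map<String, Long> deltas) {
    commitLock.lock();
    try {
      for (Map.Entry<String, Long> e : deltas.entrySet())
        // Adjust only buckets whose count is already known. Unknown buckets
        // stay absent so the next count() falls back to a scan.
        counts.computeIfPresent(e.getKey(), (name, current) -> current + e.getValue());
    } finally {
      commitLock.unlock();
    }
  }
}
```

Leaving unknown buckets out of the map, rather than starting them at the delta value, keeps a missing entry meaning "must scan".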

Pseudo algorithm

public long Bucket.count() {
  Long cachedCount = database.getBucketCount( name );
  if( cachedCount != null )
    return cachedCount;

  // SCAN THE BUCKET AND COUNT THE RECORDS
  long total = countRecords();

  database.updateBucketCount( name, total );
  return total;
}
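For reference, here is a self-contained, runnable variant of the pseudo algorithm above. `CountCache` and `SketchBucket` are hypothetical, simplified stand-ins for the database and bucket classes. The record storage is reduced to a plain number and a `scans` field is added only to show that the second call is served from the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the database-level counter map.
class CountCache {
  private final Map<String, Long> counts = new ConcurrentHashMap<>();

  Long getBucketCount(String name) {
    return counts.get(name);
  }

  void updateBucketCount(String name, long total) {
    counts.put(name, total);
  }
}

// Hypothetical, simplified bucket: the stored records are reduced to a number.
class SketchBucket {
  private final String name;
  private final CountCache database;
  private final long storedRecords; // placeholder for the on-disk records
  int scans = 0;                    // number of full scans, for illustration only

  SketchBucket(String name, CountCache database, long storedRecords) {
    this.name = name;
    this.database = database;
    this.storedRecords = storedRecords;
  }

  long count() {
    // Fast path: use the cached count when it is known.
    Long cachedCount = database.getBucketCount(name);
    if (cachedCount != null)
      return cachedCount;

    // Slow path: scan the bucket, then remember the result.
    long total = countRecords();
    database.updateBucketCount(name, total);
    return total;
  }

  // Placeholder for the real full scan of the bucket pages.
  private long countRecords() {
    scans++;
    return storedRecords;
  }
}
```

Calling `count()` twice on the same `SketchBucket` triggers a single scan; the second call returns the value stored by the first.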

Labels: enhancement (New feature or request)