
I have a piece of code deployed in Azure Container Apps that copies files from a staging Azure Blob Storage container to a final container and computes their SHA-256 hash.

However, I have noticed the container app often crashes, with CPU reaching 100%+, while computing the hash for large files (5 GB, 10 GB, ..., 25 GB).

I am using the following code for computing the hash.

private async Task<string> ComputeHashAsync(BlobClient blobClient)
{
    using (var sha256 = SHA256.Create())
    {
        using (var blobStream = await blobClient.OpenReadAsync())
        {
            using (var bufferedStream = new BufferedStream(blobStream, this._computeHashBufferSize))
            {
                var hash = sha256.ComputeHash(bufferedStream);
                return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            }
        }
    }
}

with _computeHashBufferSize set to 2097152 (2 MB).

Any suggestions to improve the above code to safeguard from causing the container app crash with higher CPU usage?
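For reference, one possible simplification is to hand the stream to the framework's asynchronous hashing API instead of the synchronous `ComputeHash`, which blocks a thread while reading. This is a minimal sketch, assuming .NET 5+ (where `HashAlgorithm.ComputeHashAsync` and `Convert.ToHexString` are available); the demo uses an in-memory stream in place of the blob stream from `OpenReadAsync`:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

// Sketch: ComputeHashAsync reads the stream asynchronously in internal
// chunks, so the whole blob is never buffered in memory and no thread
// is blocked on network I/O while the hash is computed.
static async Task<string> HashStreamAsync(Stream stream)
{
    using var sha256 = SHA256.Create();
    byte[] hash = await sha256.ComputeHashAsync(stream);
    return Convert.ToHexString(hash).ToLowerInvariant();
}

// Demo on an in-memory stream; in the real code the stream would come
// from blobClient.OpenReadAsync().
using var demo = new MemoryStream(Encoding.UTF8.GetBytes("abc"));
Console.WriteLine(await HashStreamAsync(demo));
// prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```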

  • Do you have logs? They should tell you why it's crashing... Commented Sep 2 at 11:15
  • Check memory usage in Task Manager. Commented Sep 2 at 12:25
  • Thanks for the messages. The logs don't say much; they just show "manually terminated". But from the monitoring I could see that the CPU got over 100% before every crash. Memory was below 30% while crashing. Commented Sep 2 at 13:05
  • May be the approach from this answer works better: stackoverflow.com/a/3621316/150978 Commented Sep 2 at 14:38
  • Not sure... I would first try to debug and see if I can optimize this somehow. But if this is the only solution and your app must handle 25GB files, then... you have to pay for more CPU in your app service plan. Or consider other hosting options - run this code as an Azure Function maybe? Or try AWS? Commented Sep 2 at 16:19

1 Answer


I faced a somewhat similar issue with WebJobs in Azure and resolved it by moving the WebJob to a Function App. Also, try optimizing the code with chunked processing and a memory threshold:

private async Task<string> ComputeHashAsync(BlobClient blobClient)
{
    const int maxMemoryMB = 512; // Limit memory usage
    const int chunkSize = 1024 * 1024; // 1MB chunks
    
    using (var sha256 = SHA256.Create())
    {
        using (var blobStream = await blobClient.OpenReadAsync())
        {
            byte[] buffer = new byte[chunkSize];
            int bytesRead;
            
            while ((bytesRead = await blobStream.ReadAsync(buffer, 0, chunkSize)) > 0)
            {
                // Check memory usage
                if (GC.GetTotalMemory(false) > maxMemoryMB * 1024 * 1024)
                {
                    GC.Collect();
                    GC.WaitForPendingFinalizers();
                }
                
                sha256.TransformBlock(buffer, 0, bytesRead, null, 0);
            }
            
            sha256.TransformFinalBlock(buffer, 0, 0);
            return BitConverter.ToString(sha256.Hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
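As a variation on the chunked loop above, the legacy `TransformBlock`/`TransformFinalBlock` pattern can be replaced with `IncrementalHash`, the modern streaming-hash API. This is a sketch under the assumption that the forced `GC.Collect()` calls are unnecessary here (the loop only ever holds one fixed-size buffer, and explicit collections add CPU work rather than remove it); the helper name and the in-memory demo stream are illustrative:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

// Sketch: stream the data through IncrementalHash in 1 MB chunks.
// Memory stays bounded by the single reusable buffer, and reads are
// asynchronous so no thread blocks on blob I/O.
static async Task<string> HashStreamChunkedAsync(Stream stream, int chunkSize = 1024 * 1024)
{
    using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
    byte[] buffer = new byte[chunkSize];
    int bytesRead;
    while ((bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
    {
        hasher.AppendData(buffer, 0, bytesRead);
    }
    return Convert.ToHexString(hasher.GetHashAndReset()).ToLowerInvariant();
}

// Demo on an in-memory stream; the real caller would pass the stream
// returned by blobClient.OpenReadAsync().
using var demo = new MemoryStream(Encoding.UTF8.GetBytes("abc"));
Console.WriteLine(await HashStreamChunkedAsync(demo));
// prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```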

1 Comment

Thanks @Techiemanu I will try this approach first.
