Asynchronous Processing

Asynchronous processing is a complex construct that I don’t use every day. For that reason, I will try to clarify the steps.

First, let’s look at the signatures of two asynchronous methods:
private static async Task<bool> StoreMetadataAsync(ILogger logger)
private static async Task DoAsync(Page<BlobItem> blobPage, CloudTable table, int page)

The async modifier marks a method as asynchronous; the Async suffix is a naming convention that signals this to callers. Return values are wrapped in Task<T>, and a plain Task is the asynchronous equivalent of void.
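To make the two return types concrete, here is a minimal sketch. The names SignatureDemo, CheckAsync and LogAsync are made up for illustration, and Task.Delay merely stands in for real I/O such as a storage call.

```csharp
using System;
using System.Threading.Tasks;

public static class SignatureDemo
{
    // Returns a value: the caller receives a Task<bool> and obtains the bool by awaiting it.
    public static async Task<bool> CheckAsync(int value)
    {
        await Task.Delay(100); // assumption: placeholder for real asynchronous I/O
        return value > 0;
    }

    // Returns no value: Task plays the role that void plays in a synchronous method.
    public static async Task LogAsync(string message)
    {
        await Task.Delay(100); // assumption: placeholder for real asynchronous I/O
        Console.WriteLine(message);
    }

    public static async Task Main()
    {
        bool positive = await CheckAsync(5);
        await LogAsync($"positive: {positive}");
    }
}
```

Returning Task instead of void lets the caller await the method and observe any exception it throws.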

The next step is to call the asynchronous method. This starts the method, but it is not a blocking call: control returns to the caller as soon as the method reaches its first await on an unfinished operation. We don’t wait for an answer; we simply continue with the next statement in the calling method, or we shift focus back to the user interface. Call:
Task<bool> storeMetadataTask = StoreMetadataAsync(logger);

After the asynchronous call, we can do other work in the calling method or user interface. Only at the await statement does execution of the calling method pause until the awaited asynchronous operation is complete.
bool storeMetaDataFinished = await storeMetadataTask;
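Put together, the start, work, await sequence might look like the sketch below. StoreMetadataStubAsync is a hypothetical stand-in for StoreMetadataAsync (it only waits 500 ms and returns true), so the example runs without a logger or storage account.

```csharp
using System;
using System.Threading.Tasks;

public static class StartThenAwaitDemo
{
    // Hypothetical stand-in for StoreMetadataAsync: waits 500 ms, then reports success.
    public static async Task<bool> StoreMetadataStubAsync()
    {
        await Task.Delay(500); // simulates the storage call
        return true;
    }

    public static async Task<bool> RunAsync()
    {
        // Start the operation; control comes back here at its first await.
        Task<bool> storeMetadataTask = StoreMetadataStubAsync();

        // Intermediate work overlaps with the 500 ms wait above.
        Console.WriteLine("Doing other work while the metadata is stored...");

        // Only here does RunAsync pause until the task has completed.
        bool storeMetaDataFinished = await storeMetadataTask;
        return storeMetaDataFinished;
    }

    public static async Task Main()
    {
        bool finished = await RunAsync();
        Console.WriteLine($"Finished: {finished}");
    }
}
```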

If there’s no intermediate work to do, starting the task separately gains nothing. Yes, we could write the following call, but the calling method then effectively runs sequentially, because it awaits the asynchronous method immediately (although the thread itself is not blocked while it waits):
bool storeMetaDataFinished = await StoreMetadataAsync(logger);

Let’s make it one step more complicated. What if we want to await multiple tasks together rather than each task separately? I found two posts on await Task.WhenAll.
The first example executes two known tasks. By starting the tasks asynchronously and using await Task.WhenAll instead of an await per task, we can significantly lower the elapsed time.
Note how the elapsed time is measured via the Stopwatch class.
Await Known Tasks
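The pattern with two known tasks can be sketched as follows. This is an illustration, not the code from the referenced post: FetchNumberAsync and its delays are invented, with Task.Delay simulating I/O.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class KnownTasksDemo
{
    // Hypothetical operation: waits delayMs milliseconds, then returns value.
    public static async Task<int> FetchNumberAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<int> SumBothAsync()
    {
        // Both tasks are started before either one is awaited, so their waits overlap.
        Task<int> first = FetchNumberAsync(1, 400);
        Task<int> second = FetchNumberAsync(2, 400);

        // WhenAll completes when both tasks have completed; results keep the argument order.
        int[] results = await Task.WhenAll(first, second);
        return results[0] + results[1];
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        int sum = await SumBothAsync();
        stopwatch.Stop();

        // Expect roughly one delay (about 0.4 s) rather than two.
        Console.WriteLine($"sum = {sum}, elapsed = {stopwatch.Elapsed}");
    }
}
```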

In the second example, we use a foreach construct with an unknown number of tasks. The tasks are started asynchronously and added to a list. As a final statement we await all tasks in the list.
Await Unknown Tasks
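For a number of tasks that is only known at run time, a minimal sketch could look like this. Again the names (UnknownTasksDemo, LengthAsync) are hypothetical and Task.Delay stands in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class UnknownTasksDemo
{
    // Hypothetical operation: simulates I/O, then returns the length of the item.
    public static async Task<int> LengthAsync(string item)
    {
        await Task.Delay(100);
        return item.Length;
    }

    public static async Task<int> TotalLengthAsync(IEnumerable<string> items)
    {
        var tasks = new List<Task<int>>();

        // Start one task per item without awaiting it; just remember it in the list.
        foreach (string item in items)
        {
            tasks.Add(LengthAsync(item));
        }

        // A single await for the whole list.
        int[] lengths = await Task.WhenAll(tasks);
        return lengths.Sum();
    }

    public static async Task Main()
    {
        int total = await TotalLengthAsync(new[] { "a", "bb", "ccc" });
        Console.WriteLine($"total length = {total}");
    }
}
```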

I wrote a simple console application that you can use to test the difference between a synchronous call and an asynchronous call. I performed 100 runs of 100 table inserts each, so 10,000 table inserts in total. The difference in elapsed time is substantial: sync 02:44, async 00:14 (minutes:seconds).

using Microsoft.Azure.Cosmos.Table;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

namespace TestAsync
{
    class MetaDataEntity : TableEntity
    {

        public MetaDataEntity(string session, string modifiedDate)
        {
            this.PartitionKey = session; this.RowKey = modifiedDate;
        }

        public MetaDataEntity() { }

        //public string Email { get; set; }

    }

    class Program
    {
        static async Task Main(string[] args)
        {

            await ParentMethodAsync();

        }

        private static async Task ParentMethodAsync()
        {

            var stopwatch = new Stopwatch();
            string connectionString = "...";

            List<Task> listOfTasks = new List<Task>();

            // Create a CloudTableClient
            CloudStorageAccount account = 
              CloudStorageAccount.Parse(connectionString);
            CloudTableClient tableClient = 
              account.CreateCloudTableClient();
            CloudTable table = tableClient.GetTableReference("MetaData");

            // Synchronous Call: each run is awaited before the next one starts
            stopwatch.Start();
            Console.WriteLine("Synchronous Call");

            for (int i = 0; i < 100; i++)
            {
                Task forEachTask = DoAsync(table, i);
                await forEachTask;
            }

            stopwatch.Stop();
            Console.WriteLine(stopwatch.Elapsed);

            // Asynchronous Call: all runs are started first, then awaited together
            stopwatch.Reset();
            stopwatch.Start();
            Console.WriteLine("Asynchronous Call");

            for (int i = 0; i < 100; i++)
            {
                Task forEachTask = DoAsync(table, i);
                listOfTasks.Add(forEachTask);
            }

            await Task.WhenAll(listOfTasks);

            stopwatch.Stop();
            Console.WriteLine(stopwatch.Elapsed);

        }

        private static async Task DoAsync(CloudTable table, int run)
        {
            // One entity instance is reused; only its keys change per insert.
            MetaDataEntity metaDataEntity = new MetaDataEntity();

            // Each run covers its own block of 100 keys, so runs never overlap.
            for (int i = run * 100; i < (run + 1) * 100; i++)
            {
                metaDataEntity.PartitionKey = $"session{i}";
                metaDataEntity.RowKey = $"device{i}";
                TableOperation insertOperation =
                    TableOperation.InsertOrMerge(metaDataEntity);
                await table.ExecuteAsync(insertOperation);
            }
        }

    }
}

Related to the concept of asynchronous execution are the concepts of concurrency and parallelism.

Async execution allows a single thread to start one operation and then do something else instead of waiting for that operation to finish.
Concurrency (thread-based concurrency) divides the work over multiple threads, which can make it faster.
Parallelism (CPU-based concurrency) divides the work over multiple threads running on multiple processors or processor cores.
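A small sketch may help to keep these notions apart. It contrasts overlapping waits using await with CPU-bound work split by Parallel.For. All names are hypothetical, and Task.Delay stands in for I/O.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical I/O-style operation: no thread is blocked during the delay.
    private static async Task<int> WaitAndReturnAsync(int value)
    {
        await Task.Delay(200);
        return value;
    }

    // Asynchronous: three waits overlap, so this takes about one delay, not three.
    public static async Task<int> OverlapWaitsAsync()
    {
        Task<int> a = WaitAndReturnAsync(1);
        Task<int> b = WaitAndReturnAsync(2);
        Task<int> c = WaitAndReturnAsync(3);
        int[] values = await Task.WhenAll(a, b, c);
        return values[0] + values[1] + values[2];
    }

    // Parallel: CPU-bound work is divided over worker threads,
    // which the runtime may schedule on several cores.
    public static long SumOfSquaresParallel(int n)
    {
        long total = 0;
        Parallel.For(1, n + 1, i =>
        {
            // Interlocked keeps the shared total correct across threads.
            Interlocked.Add(ref total, (long)i * i);
        });
        return total;
    }

    public static async Task Main()
    {
        Console.WriteLine($"waits: {await OverlapWaitsAsync()}");
        Console.WriteLine($"squares: {SumOfSquaresParallel(1000)}");
    }
}
```

The first method gains speed by not waiting idly; the second gains speed by using more threads. Table inserts, as in the console application above, fall into the first category.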

For thread-based concurrency Microsoft provides the Task Parallel Library, notably Parallel.ForEach. The threads can execute on multiple processors or CPU cores, but that’s where it gets a bit vague. Parallel LINQ also uses thread-based concurrency. I have added the following code to the above example. The code now executes in about 00:17 (minutes:seconds). That’s slower than expected, given that the plain asynchronous version took 00:14.

            // Parallel
            int[] runs = new int[100];
            for (int i = 0; i < 100; i++)
            {
                runs[i] = i;
            }

            stopwatch.Reset();
            stopwatch.Start();
            Console.WriteLine("Parallel");

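            // The lambda is not async: it only starts each task and stores it.
            // List<T> is not thread-safe, so adding from the parallel loop is locked.
            Parallel.ForEach(runs, i =>
            {

                Task forEachTask = DoAsync(table, i);
                lock (listOfTasks)
                {
                    listOfTasks.Add(forEachTask);
                }

            });

            await Task.WhenAll(listOfTasks);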

            stopwatch.Stop();
            Console.WriteLine(stopwatch.Elapsed);

I thought that mixing multi-threading and asynchronous execution might be the problem, so I replaced the asynchronous call with a synchronous call. As could be expected, that is slower still: 00:38.

            // Parallel
            int[] runs = new int[100];
            for (int i = 0; i < 100; i++)
            {
                runs[i] = i;
            }

            stopwatch.Reset();
            stopwatch.Start();
            Console.WriteLine("Parallel");

            Parallel.ForEach(runs, i =>
            {
                
                DoSync(table, i);
                
            });

            stopwatch.Stop();
            Console.WriteLine(stopwatch.Elapsed);

        private static void DoSync(CloudTable table, int run)
        {
            // Same work as DoAsync, but every insert blocks the calling thread.
            MetaDataEntity metaDataEntity = new MetaDataEntity();

            for (int i = run * 100; i < (run + 1) * 100; i++)
            {
                metaDataEntity.PartitionKey = $"session{i}";
                metaDataEntity.RowKey = $"device{i}";
                TableOperation insertOperation =
                    TableOperation.InsertOrMerge(metaDataEntity);
                table.Execute(insertOperation);
            }
        }
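
As a side note on combining a parallel loop with asynchronous work: on .NET 6 or later, Parallel.ForEachAsync accepts an asynchronous body and awaits it. The sketch below is an assumption-laden illustration, not a measurement. DoWorkStubAsync is a hypothetical stand-in for DoAsync(table, run), and the degree of parallelism is an arbitrary choice.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    private static int completedRuns;

    // Hypothetical stand-in for DoAsync(table, run): simulates I/O, then counts the run.
    public static async Task DoWorkStubAsync(int run, CancellationToken cancellationToken)
    {
        await Task.Delay(50, cancellationToken);
        Interlocked.Increment(ref completedRuns);
    }

    public static async Task<int> RunAllAsync(int runCount, int maxParallel)
    {
        completedRuns = 0;

        // Assumption: at most maxParallel runs are in flight at the same time.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // The returned task completes only after every asynchronous body has finished.
        await Parallel.ForEachAsync(
            Enumerable.Range(0, runCount),
            options,
            async (run, cancellationToken) =>
            {
                await DoWorkStubAsync(run, cancellationToken);
            });

        return completedRuns;
    }

    public static async Task Main()
    {
        int completed = await RunAllAsync(100, 10);
        Console.WriteLine($"completed runs: {completed}");
    }
}
```

Whether this would beat the plain Task.WhenAll version for the table inserts is not something this sketch shows; that would have to be measured with the Stopwatch as above.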