A few weeks ago I made my first foray into multithreading. The idea was to speed up a housekeeping service (all it does at the moment is call, or consume, a web service that deletes stuff from the database) by using several threads. Since we have four servers, I thought I could speed it up by starting four threads, each one talking to a different application server.
All seemed to be going well until I tested it properly, i.e. deleting hundreds of entities rather than just a few. Then something bizarre happened: I kept getting an IndexOutOfRangeException on line 25, which I could not understand, as threadNumber is not modified anywhere else in the code (only the relevant method is shown below).
 1  public static void RunBatch(string entityName)
 2  {
 3      try
 4      {
 5          List<string> entityIds = GetEntityListToBeDeleted(entityName);
 6
 7          if (entityIds.Count > 0)
 8          {
 9              //number of entities that each thread will delete.
10              int entitiesperthread = entityIds.Count / nofThreads;
11
12              for (int threadNumber = 0; threadNumber < nofThreads; threadNumber++)
13              {
14                  MRE[threadNumber] = new ManualResetEvent(false);
15
16                  myService = InstantiateService(threadNumber);
17
18                  if (threadNumber != nofThreads - 1)
19                  {
20
21                      threads[threadNumber] = new Thread(() => DeleteMethod(threadNumber * entitiesperthread, (threadNumber + 1) * entitiesperthread - 1, MRE[threadNumber], threadName, myService, entityIds, entityName));
22                  }
23                  else
24                  {
25                      threads[threadNumber] = new Thread(() => DeleteMethod(threadNumber * entitiesperthread, entityIds.Count, MRE[threadNumber], threadName, myService, entityIds, entityName));
26                  }
27
28                  threads[threadNumber].Start();
29
30              }
31
32              WaitHandle.WaitAll(MRE);
33
34          }
35      }
36      catch (Exception ex)
37      {
38          //TODO: Log it.
39      }
40
41  }
It turns out that a lambda (anonymous method), which is what I use to tell each thread what to do, does not receive a copy of the values it uses. It captures the variables themselves, so every lambda shares the single threadNumber variable declared by the for loop. The loop keeps running and finishes very quickly, in fact before the threads have actually started. By the time a thread gets a chance to call DeleteMethod, the loop has already incremented threadNumber to 4 (in my case), so indexing with it (presumably MRE[threadNumber]) goes out of range and we get the IndexOutOfRangeException. A simple change gets us over the hurdle:
public static void RunBatch(string entityName)
{
    try
    {
        List<string> entityIds = GetEntityListToBeDeleted(entityName);

        if (entityIds.Count > 0)
        {
            //number of entities that each thread will delete.
            int entitiesperthread = entityIds.Count / nofThreads;

            for (int i = 0; i < nofThreads; i++)
            {
                // Copy the loop variable: each iteration gets its own threadNumber,
                // so each lambda below captures a value that never changes.
                int threadNumber = i;

                MRE[threadNumber] = new ManualResetEvent(false);

                // Note: myService is assigned outside the lambdas. If it is a shared
                // field, it is captured the same way and may need its own local copy.
                myService = InstantiateService(threadNumber);

                if (threadNumber != nofThreads - 1)
                {
                    threads[threadNumber] = new Thread(() => DeleteMethod(threadNumber * entitiesperthread, (threadNumber + 1) * entitiesperthread - 1, MRE[threadNumber], threadName, myService, entityIds, entityName));
                }
                else
                {
                    threads[threadNumber] = new Thread(() => DeleteMethod(threadNumber * entitiesperthread, entityIds.Count, MRE[threadNumber], threadName, myService, entityIds, entityName));
                }

                threads[threadNumber].Start();
            }

            WaitHandle.WaitAll(MRE);
        }
    }
    catch (Exception ex)
    {
        //TODO: Log it.
    }
}
Now the for loop still finishes before the threads get going, but because each lambda captures its own per-iteration copy of the loop variable rather than the loop variable itself, threadNumber never reaches 4 inside a thread and everything runs fine.
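To see the capture behaviour in isolation, here is a minimal sketch with no threads involved. The class and method names (ClosureCaptureDemo, CaptureLoopVariable, CaptureLocalCopy) are made up for illustration and are not part of the housekeeping service.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Every lambda captures the one loop variable `i`, not the value it
    // had when the lambda was created.
    public static List<int> CaptureLoopVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            actions.Add(() => i);
        }
        // The loop has finished here, so every lambda sees i == count.
        return InvokeAll(actions);
    }

    // Every lambda captures its own `copy`, declared fresh in each iteration.
    public static List<int> CaptureLocalCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            actions.Add(() => copy);
        }
        return InvokeAll(actions);
    }

    // Runs the lambdas only after the loop that created them has completed,
    // which mimics threads that start late.
    private static List<int> InvokeAll(List<Func<int>> actions)
    {
        var results = new List<int>();
        foreach (Func<int> action in actions)
        {
            results.Add(action());
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureLoopVariable(4))); // 4, 4, 4, 4
        Console.WriteLine(string.Join(", ", CaptureLocalCopy(4)));    // 0, 1, 2, 3
    }
}
```

The same applies to any outer variable a lambda uses, not just loop counters. A foreach loop variable has been scoped per iteration since C# 5, but a for loop variable is still shared across iterations.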
What improvements can we expect by fully utilizing our app servers?
Well, we've gone from 6 deletions per minute for a single thread to 27 deletions per minute for 4 threads (1 per server). As an experiment I set two threads per server and we got an improvement but only to 33 deletions per minute. I've not tried running three threads per server, but I can't see that we'll get a massive improvement out of that.
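To put those numbers in perspective, here is a quick back-of-the-envelope calculation as a sketch. The helper names (ThroughputComparison, Speedup, PerThread) are hypothetical, and the only inputs are the three measurements quoted above.

```csharp
using System;
using System.Globalization;

public static class ThroughputComparison
{
    // How many times faster a run is than the single-threaded baseline.
    public static double Speedup(double baselinePerMinute, double measuredPerMinute)
    {
        return measuredPerMinute / baselinePerMinute;
    }

    // Average deletions per minute contributed by each thread.
    public static double PerThread(double measuredPerMinute, int threadCount)
    {
        return measuredPerMinute / threadCount;
    }

    public static void Main()
    {
        const double singleThread = 6;   // 1 thread in total
        const double fourThreads = 27;   // 1 thread per server
        const double eightThreads = 33;  // 2 threads per server

        Print("4 threads", Speedup(singleThread, fourThreads), PerThread(fourThreads, 4));
        Print("8 threads", Speedup(singleThread, eightThreads), PerThread(eightThreads, 8));
    }

    private static void Print(string label, double speedup, double perThread)
    {
        // Invariant culture keeps the decimal separator the same on every machine.
        Console.WriteLine(string.Format(
            CultureInfo.InvariantCulture,
            "{0}: {1:F3}x speedup, {2:F3} deletions/min per thread",
            label, speedup, perThread));
    }
}
```

That works out to 4.5x and 5.5x over the single thread, with the per-thread average dropping from 6.75 to 4.125 deletions per minute once each server has two threads, which fits the diminishing returns I saw.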