Saturday 28 February 2015

Ordered Parallel Processing in C#

In one of the projects I've been working on, we need to process a bunch of files containing invoice data. Processing these can be time consuming, as the files can be quite large. Although the way the data is used suggests that this could be done overnight, the business has insisted on processing the files during the online day, at 17:00.

The problem is that the files tend to contain each invoice's journey through its various states, and for audit purposes we need to process all of them.

So, for instance, if the first record in a file is an on-hold invoice, we want to process it, but we also want to process the same invoice showing as paid further down the file. We can't just process the paid event. Furthermore, we also want the invoice record to end with a status of paid, which is fairly reasonable.

The problem is that if we process the invoices in parallel, we have no guarantee that they will be processed in the right order, so a paid invoice record might end up with a state of issued, which is not great. So we just went for the quick and easy solution and processed the files serially.

I gave the matter a little bit more thought and came up with this:

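private void UpdateInvoices(IEnumerable<IInvoice> invoices)
{
    // Group by status and visit the groups in ascending status order.
    var groupedInvoices = invoices.GroupBy(x => x.Status)
        .OrderBy(x => x.Key);

    foreach (var invoiceGroup in groupedInvoices)
    {
        // po is a ParallelOptions instance defined elsewhere in the class.
        // Parallel.ForEach blocks until the whole group has been processed.
        Parallel.ForEach(invoiceGroup, po, (invoice) =>
        {
            UpdateInvoice(invoice);
        });
    }
}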
We group all the invoices by status and order the groups by status. We then process all of the invoices in a status group in parallel, one group at a time, so that all invoices with a status of issued get processed first and paid last, with the other statuses processed in between. This relies on the statuses sorting in the order in which they occur in an invoice's life.

It is, of course, possible to write a separate Parallel.ForEach loop for each status, but I feel that this solution is more elegant and easier to maintain.
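To see the idea in isolation, here is a minimal, self-contained sketch. Everything in it is made up for illustration: the InvoiceStatus enum (whose declaration order is assumed to match the order of an invoice's life), the InvoiceRecord class, and a ConcurrentDictionary standing in for the database. It is not the project's code, just the same grouping trick on toy types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical statuses: the numeric order is assumed to be the processing order.
public enum InvoiceStatus { Issued = 0, OnHold = 1, Paid = 2 }

// Hypothetical record type standing in for IInvoice.
public sealed class InvoiceRecord
{
    public int InvoiceId { get; set; }
    public InvoiceStatus Status { get; set; }
}

public static class OrderedParallelSketch
{
    // Returns a map of invoice id -> last status written (our pretend database).
    public static ConcurrentDictionary<int, InvoiceStatus> Process(IEnumerable<InvoiceRecord> records)
    {
        var store = new ConcurrentDictionary<int, InvoiceStatus>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

        // One group per status, visited in ascending status order.
        var groups = records.GroupBy(r => r.Status).OrderBy(g => g.Key);

        foreach (var group in groups)
        {
            // Parallel.ForEach only returns once every record in the group is done,
            // so a record with a later status can never be overtaken by an earlier one.
            Parallel.ForEach(group, options, record =>
            {
                store[record.InvoiceId] = record.Status;
            });
        }

        return store;
    }
}
```

If the same invoice appears in the input as Paid, then OnHold, then Issued, the value stored for it after Process returns is still Paid, because the Paid group is always the last one to run.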

PLINQ does have an AsOrdered method, but the UpdateInvoice method doesn't return anything. If it fails to update the database, it simply logs the failure, and it's for the server boys and girls to worry about.

Furthermore, it simply doesn't quite work as I might have expected it to work.

The code from this sample has been modified to better simulate what we're trying to achieve:

var source = Enumerable.Range(9, 30);

var parallelQuery = source.AsParallel().AsOrdered()
    .Where(x => x % 3 == 0)
    .Select(x => { System.Diagnostics.Debug.WriteLine("{0} ", x); return x; });

// Use foreach to preserve order at execution time.
foreach (var v in parallelQuery)
{
    System.Diagnostics.Debug.WriteLine("Project");
    break;
}

// Some operators expect an ordered source sequence.
var lowValues = parallelQuery.Take(10);

int counter = 0;
foreach (var v in lowValues)
{
    System.Diagnostics.Debug.WriteLine("{0}-{1}", counter, v);
    counter++;
}
The call to Debug.WriteLine plays the same role as UpdateInvoice in the code above, in the sense that both are void methods that cause side effects.

This is what the above prints:
9 15 18 12 30 33 36 21 24 27
Project
9 15 18 12 30 21 36 27 24 33 
0-9 1-12 2-15 3-18 4-21 5-24 6-27 7-30 8-33 9-36 


As you can see, the end result is ordered but the getting there isn't, and the getting there is what we're interested in, which is why we could not use PLINQ.


Saturday 14 February 2015

Gas and Electricity Consumption in a 1920s mid-terrace house in the North of England.

Last week I was going through some old pen drives to see if there was actually anything worth keeping and I found a lot of old energy consumption measurements I took back at our old house, so I thought I would share them here.

The house was a small mid-terrace house, with central heating and a gas cooker, built after the First World War. I started taking the measurements after I decided that leaving my gaming PC on 24/7 wasn't a great idea. I should've taken a few measurements with it on, but there you go. We only heated the house to a relatively low temperature, i.e. ~18 °C.

Unfortunately, I don't have measurements of outside temperature so I cannot correlate energy use to outside temperature, but the data was gathered to try to get a better understanding of how much gas and electricity we were using at the time.

Without further ado here are the charts:

It's hard to see electricity consumption in the above chart, so here it is:

Estimated costs below. I will not rant about the rather ludicrous way gas and electricity are priced in this country.



Electricity on its own again:





Wednesday 11 February 2015

Brain Dump 7 - Remove User from group in SharePoint

Clue is in the title

In essence, below is a method that will remove a user from a group in SharePoint.

The user can be of the form domain\user or user@domain.


// context is a Microsoft.SharePoint.Client.ClientContext field initialised elsewhere.
// Returns true if the user was found in the group and removed, false otherwise.
public bool RemoveUserFromSharePointGroup(string userName, string groupName)
{
 var principal = Microsoft.SharePoint.Client.Utilities.Utility.ResolvePrincipal(context, context.Web, userName,
  Microsoft.SharePoint.Client.Utilities.PrincipalType.User, Microsoft.SharePoint.Client.Utilities.PrincipalSource.All,
  context.Web.SiteUsers, false);

 context.ExecuteQuery();

 if (principal.Value == null)
 {
  // The user name could not be resolved.
  return false;
 }

 string login = principal.Value.LoginName;
 GroupCollection siteGroups = context.Web.SiteGroups;
 Group group = siteGroups.GetByName(groupName);

 var query = context.LoadQuery(group.Users.Where(usr => usr.LoginName == login).Include(u => u.LoginName));

 context.ExecuteQuery();

 User user = query.SingleOrDefault();

 if (user == null)
 {
  // The user is not a member of the group.
  return false;
 }

 group.Users.RemoveByLoginName(user.LoginName);
 context.ExecuteQuery();

 return true;
}

Tuesday 10 February 2015

Brain Dump 6 - Allow requests of any length in IIS

The same request from two different browsers to a custom WCF service.

First in Firefox:

http://devbox.dev.com:8732/Mock.svc/Mock/GetPartNumber?data=N^99ac52cd-142b-4b84-8b8e-849e320ee8cc^GetPartNumber^%3Ccontent%3E%3CGetPartNumber%3E%3Cid%3E99ac52cd-142b-4b84-8b8e-849e320ee8cc%3C/id%3E%3CsupplierDetails%3E%3CsupplierNameLine1%3EsupplierNameLine1%3C/supplierNameLine1%3E%3CsupplierNameLine2%3EsupplierNameLine2%3C/supplierNameLine2%3E%3CsupplierAddressLine1%3EsupplierAddressLine1%3C/supplierAddressLine1%3E%3CsupplierAddressLine2%3EsupplierAddressLine2%3C/supplierAddressLine2%3E%3CsupplierAddressLine3%3EsupplierAddressLine3%3C/supplierAddressLine3%3E%3CsupplierAddressLine4%3EsupplierAddressLine4%3C/supplierAddressLine4%3E%3CsupplierTownOrCity%3EsupplierTownOrCity%3C/supplierTownOrCity%3E%3CsupplierCounty%3EsupplierCounty%3C/supplierCounty%3E%3CsupplierCountry%3EsupplierCountry%3C/supplierCountry%3E%3CsupplierPostCode%3EsupplierPostCode%3C/supplierPostCode%3E%3C/supplierDetails%3E%3CcustomerDetails%3E%3CcustomerName%3EcustomerName%3C/customerName%3E%3CaddressLine1%3EaddressLine1%3C/addressLine1%3E%3CaddressLine2%3EaddressLine2%3C/addressLine2%3E%3CaddressLine3%3EaddressLine3%3C/addressLine3%3E%3CaddressLine4%3EaddressLine4%3C/addressLine4%3E%3CtownOrCity%3EtownOrCity%3C/townOrCity%3E%3Ccounty%3Ecounty%3C/county%3E%3Ccountry%3EUnited%20Kingdom%3C/country%3E%3CpostCode%3EpostCode%3C/postCode%3E%3C/customerDetails%3E%3CvatNumber%3EGB12345%3C/vatNumber%3E%3CdocumentNumberPrefix%3ESIA%3C/documentNumberPrefix%3E%3CdocumentNumber%3E1%3C/documentNumber%3E%3CtransactionNumber%3E1%3C/transactionNumber%3E%3CdateDocumentRaised%3E2014-09-19%3C/dateDocumentRaised%3E%3CdescriptionOfItemSold%3EdescriptionOfItemSold%3C/descriptionOfItemSold%3E%3CquantitySold%3E1%3C/quantitySold%3E%3CitemCostNet%3E80.00%3C/itemCostNet%3E%3CtotalNetCostOfItems%3E80.00%3C/totalNetCostOfItems%3E%3CnetTotal%3E80.00%3C/netTotal%3E%3CnetDiscount%3E0.00%3C/netDiscount%3E%3CvatRate%3E25.00%3C/vatRate%3E%3CvatAmount%3E20.00%3C/vatAmount%3E%3CgrossTotal%3E100.00%3C/grossTotal%3E%3CformatDocumentNumber
%3ESIA000000001%3C/formatDocumentNumber%3E%3CgenesesData%3E%3CbookingReference%3E79bbec92-5bca-44d2-8e61-bde366a0379b%3C/bookingReference%3E%3C/genesesData%3E%3C/GetPartNumber%3E%3C/content%3E

This is approximately 2193 characters, and thus bytes, assuming ASCII encoding.

And now in IE:

http://devbox.dev.com:8732/Mock.svc/Mock/GetPartNumber?data=N^99ac52cd-142b-4b84-8b8e-849e320ee8cc^GetPartNumber^<content><GetPartNumber><id>99ac52cd-142b-4b84-8b8e-849e320ee8cc</id><supplierDetails><supplierNameLine1>supplierNameLine1</supplierNameLine1><supplierNameLine2>supplierNameLine2</supplierNameLine2><supplierAddressLine1>supplierAddressLine1</supplierAddressLine1><supplierAddressLine2>supplierAddressLine2</supplierAddressLine2><supplierAddressLine3>supplierAddressLine3</supplierAddressLine3><supplierAddressLine4>supplierAddressLine4</supplierAddressLine4><supplierTownOrCity>supplierTownOrCity</supplierTownOrCity><supplierCounty>supplierCounty</supplierCounty><supplierCountry>supplierCountry</supplierCountry><supplierPostCode>supplierPostCode</supplierPostCode></supplierDetails><customerDetails><customerName>customerName</customerName><addressLine1>addressLine1</addressLine1><addressLine2>addressLine2</addressLine2><addressLine3>addressLine3</addressLine3><addressLine4>addressLine4</addressLine4><townOrCity>townOrCity</townOrCity><county>county</county><country>United Kingdom</country><postCode>postCode</postCode></customerDetails><vatNumber>GB12345</vatNumber><documentNumberPrefix>SIA</documentNumberPrefix><documentNumber>1</documentNumber><transactionNumber>1</transactionNumber><dateDocumentRaised>2014-09-19</dateDocumentRaised><descriptionOfItemSold>descriptionOfItemSold</descriptionOfItemSold><quantitySold>1</quantitySold><itemCostNet>80.00</itemCostNet><totalNetCostOfItems>80.00</totalNetCostOfItems><netTotal>80.00</netTotal><netDiscount>0.00</netDiscount><vatRate>25.00</vatRate><vatAmount>20.00</vatAmount><grossTotal>100.00</grossTotal><formatDocumentNumber>SIA000000001</formatDocumentNumber><genesesData><bookingReference>79bbec92-5bca-44d2-8e61-bde366a0379b</bookingReference></genesesData></GetPartNumber></content>

This is approximately 1863 characters, and thus bytes, assuming ASCII encoding.

This means that the first request makes IIS choke and the second one works fine, as only the second is below the default 2 KB (2048 byte) limit on the length of the query string.
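The difference in size comes purely from the escaping: every character that gets percent-encoded grows from one byte to three. A rough sketch of the arithmetic follows. The class and method names are made up for illustration, and the escaping only mimics the three substitutions visible in the Firefox URL above (it is not a general URL encoder).

```csharp
using System;
using System.Text;

public static class QueryLengthSketch
{
    // Default value of maxQueryString in IIS request filtering, in bytes.
    public const int DefaultMaxQueryString = 2048;

    // Mimics only the escaping seen in the Firefox URL:
    // '<' becomes %3C, '>' becomes %3E and ' ' becomes %20.
    public static string EscapeLikeFirefox(string query)
    {
        return query.Replace("<", "%3C").Replace(">", "%3E").Replace(" ", "%20");
    }

    // True if the query string, counted in ASCII bytes, is within the default limit.
    public static bool FitsDefaultLimit(string query)
    {
        return Encoding.ASCII.GetByteCount(query) <= DefaultMaxQueryString;
    }
}
```

For instance, `<id>1</id>` is 10 characters long, but after EscapeLikeFirefox it becomes `%3Cid%3E1%3C/id%3E`, which is 18 characters: four escaped characters at two extra bytes each. Do that to a couple of hundred angle brackets and a query string that was comfortably under 2048 bytes no longer is.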

There is a relatively simple solution. Modify the web.config of the WCF service, replacing length with the maximum number of bytes to allow in the query string:

<system.webServer>
  <security>
    <requestFiltering>
      <requestLimits maxQueryString="length"/>
    </requestFiltering>
  </security>
</system.webServer>