Monday 26 October 2015

Adventures using Availability Groups and RBS with SharePoint 2013

The concept behind Remote BLOB Storage (RBS) is pretty simple; see this for instance. I just want to talk about the myriad issues we've had when using RBS with Availability Groups.

Our database setup uses Availability Groups (AGs), which, and this is controversial, are essentially a cheap cluster. I do get that there are advantages to Availability Groups, but these seem to be outweighed by the disadvantages. I know this is just my opinion, and yes, I also know nothing about Availability Groups, HA clusters or anything in general, thank you for pointing that out.

So what are the drawbacks of AGs?

  • Official support is patchy, e.g. in Dynamics CRM 2013 one is forced to update the database directly.
  • Performance can be an issue, as the database is always running on at least two boxes.
  • Stuff flat out refuses to work, e.g. the RBS Maintainer and various SharePoint database-related operations.
To the AG mix we introduced RBS, and this is where things started to go horribly wrong for us.

The first issue we encountered was the inability to delete a content database from SharePoint, which is not a major issue but it's really annoying.

The second issue was that the RBS Maintainer would not work, so the storage requirements would just keep growing. This might not be an issue if you don't plan to archive your documents, but our DB had ~500 GB of docs, about 2/3 of which were old but for contractual reasons needed to be kept.

This effectively put a nail in the coffin of the RBS + AG combo but there is more.

In order to load the ~500 GB of documents, we had a multi-threaded tool that read the documents from the source DB and uploaded them to SharePoint using the SharePoint Client-Side Object Model (CSOM).
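The tool itself isn't something I can share, but its threading structure was roughly the following sketch. The CSOM upload call is stubbed out as a delegate here, and `SourceDocument` and `UploadAll` are illustrative names, not the real ones:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Illustrative stand-in for a document row read from the source DB.
public class SourceDocument
{
    public string Name;
    public byte[] Content;
}

public static class Uploader
{
    // Pushes documents to SharePoint with a bounded degree of parallelism.
    // The actual upload (a CSOM call in the real tool) is a delegate, so the
    // threading structure can be seen on its own.
    public static int UploadAll(IEnumerable<SourceDocument> docs,
                                Action<SourceDocument> upload,
                                int maxDegreeOfParallelism)
    {
        int uploaded = 0;
        Parallel.ForEach(
            docs,
            new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism },
            doc =>
            {
                upload(doc); // in the real tool: a CSOM file upload
                Interlocked.Increment(ref uploaded);
            });
        return uploaded;
    }
}
```

In the real tool the upload delegate went through the CSOM against the destination site; everything else about the structure is as above.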

At this point, it's worth mentioning that our hosting provider does not guarantee any sort of performance level; the reasons are too long to explain here.

A couple of weeks back, with RBS on the database, we did a trial run of the upload and we were hitting very poor rates, ~ 4 GB per hour.

Last week, after RBS had been disabled and the content databases recreated, we tried a second trial run and the speed jumped to ~ 20 GB per hour.

I can't say that our RBS configuration was perfect; I think the threshold was on the low side (128 KB), but even so, the speed increase has been massive.

It actually gets better, because the 4 GB per hour figure was using both servers in the farm, whereas the 20 GB per hour figure was simply using one.

Yes, yes, I know our hosting provider is crap and 128 KB is below the recommended threshold, but a five-fold increase in transfer rates and an error rate lowered to almost zero is something that should be considered.

Sunday 4 October 2015

Integrating MS SharePoint with MS Dynamics CRM 2011/2013 - User Permissions

One of the things that seems to come up time and again on any integration of MS Dynamics CRM and SharePoint is the issue of user permissions.

Generally speaking it would be nice to be able to control access to SharePoint based upon the permissions the user has in MS Dynamics CRM. Alas, this is not possible without writing a bit of code. (I've not investigated the server-to-server integration yet, as it seems to be available for Dynamics CRM Online only.)

The way I have done this is by deploying an HttpModule to the MS SharePoint servers that checks whether the user making the request to the MS SharePoint site actually has access to the record in MS Dynamics CRM itself.

In our case this is straightforward, as we only store documents for a single entity, but there is nothing in principle to rule out an expansion to multiple entities.

Depending on the usage, caching will need to be a serious consideration, as performance could be impacted, but I have not thought about it too much yet.

The following assumptions have been made about the integration between MS Dynamics CRM and MS SharePoint:
  1. A document library exists and is named after the entity's schema name.
  2. Each entity record in MS Dynamics CRM has a single folder in MS SharePoint and this folder is named with the GUID of the record. 
  3. Entity Records in MS Dynamics CRM are not shared.
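Taken together, these assumptions mean a record's documents live in a predictable location. A minimal sketch of the mapping (the site URL and helper name are made up for illustration):

```csharp
using System;

public static class DocumentLocations
{
    // Builds the SharePoint folder URL for a CRM record, per the assumptions
    // above: one library per entity schema name, one folder per record GUID.
    public static string FolderUrlFor(string siteUrl, string entitySchemaName, Guid recordId)
    {
        return string.Format("{0}/{1}/{2}",
            siteUrl.TrimEnd('/'),
            entitySchemaName,
            recordId.ToString().ToUpperInvariant());
    }
}
```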

This is the code for the module itself:

using log4net;
using Microsoft.Crm.Sdk.Messages;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Client;
using Microsoft.Xrm.Sdk.Query;
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Configuration;
using System.Linq;
using System.Net;
using System.Security.Principal;
using System.ServiceModel.Description;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using System.Web;

namespace CRMUserPermissions
{
    public class CRMUserPermissions : IHttpModule
    {

        public static ConcurrentDictionary<string, Guid> userIds = new ConcurrentDictionary<string, Guid>();

        const string GuidPattern = @"(/|%2f)([A-F0-9]{8}(?:-[A-F0-9]{4}){3}-[A-F0-9]{12})";

        const string UserIdQuery = @"<fetch version='1.0' output-format='xml-platform' mapping='logical' distinct='false'>
  <entity name='systemuser'>   
    <attribute name='systemuserid' />
    <attribute name='domainname' />
    <filter type='and'>
      <condition attribute='domainname' operator='eq' value='{0}' />
    </filter> 
  </entity>
</fetch>";

        public void Dispose()
        {
        }

        public void Init(HttpApplication context)
        {
            context.PostAuthenticateRequest += new EventHandler(context_PostAuthenticateRequest);
        }

        void context_PostAuthenticateRequest(object sender, EventArgs e)
        {
            HttpApplication app = sender as HttpApplication;
            HttpContext context = app.Context;

            if (IsRequestRelevant(context))
            {
                try
                {

                    string user = HttpContext.Current.User.Identity.Name.Split('|').Last();

                    var service = CrmService.GetService();

                    string url = app.Request.Url.ToString();

                    if (!userIds.ContainsKey(user))
                    {
                        string query = string.Format(UserIdQuery, user);
                        var userRecord = service.RetrieveMultiple(new FetchExpression(query)).Entities.SingleOrDefault();
                        // If the user does not exist in CRM, userRecord is null and the
                        // resulting exception is handled below by denying access.
                        userIds.TryAdd(user, userRecord.Id);
                    }

                    var record = GetRecordInfo(url);

                    RetrievePrincipalAccessRequest princip = new RetrievePrincipalAccessRequest();
                    princip.Principal = new EntityReference("systemuser", userIds[user]);

                    princip.Target = new EntityReference(record.Item1, record.Item2);

                    var res = (RetrievePrincipalAccessResponse)service.Execute(princip);

                    if (res.AccessRights == AccessRights.None)
                    {
                        app.Response.StatusCode = 403;
                        app.Response.SubStatusCode = 1;
                        app.CompleteRequest();
                    }
   
                }
                catch (Exception)
                {
                    // Fail closed: any error (user not found in CRM, CRM unavailable,
                    // unparseable URL, ...) results in access being denied.
                    app.Response.StatusCode = 403;
                    app.Response.SubStatusCode = 1;
                    app.CompleteRequest();
                }
            }
        }
    }
}

A few comments are in order since I'm not including all methods.

IsRequestRelevant(context): checks that the user is authenticated and that the request is for documents relating to an entity whose access we want to control via this module.
CrmService.GetService(): returns an OrganizationServiceProxy.
GetRecordInfo(url): works out the record GUID and what type of entity it is.
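As an illustration of what GetRecordInfo does, here is a minimal sketch that pulls the record GUID and entity name out of a URL shaped like the folder convention in the assumptions above. It is not the actual implementation; note it adds IgnoreCase, since SharePoint URLs are not guaranteed to use uppercase GUIDs:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RecordInfo
{
    // Same pattern as the module's GuidPattern constant.
    const string GuidPattern = @"(/|%2f)([A-F0-9]{8}(?:-[A-F0-9]{4}){3}-[A-F0-9]{12})";

    // Returns (entity logical name, record id), assuming the URL contains
    // /<entityname>/<guid> as per the folder-naming convention.
    public static Tuple<string, Guid> Parse(string url)
    {
        var match = Regex.Match(url, GuidPattern, RegexOptions.IgnoreCase);
        if (!match.Success)
            throw new ArgumentException("No record GUID in URL: " + url);

        var id = new Guid(match.Groups[2].Value);

        // The segment immediately before the GUID is the document library,
        // i.e. the entity schema name (assumption 1).
        var prefix = url.Substring(0, match.Index);
        var entity = prefix.Substring(prefix.LastIndexOf('/') + 1);
        return Tuple.Create(entity, id);
    }
}
```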

It would probably be a better idea to get all, or some, users and cache them on receipt of the first query.

Depending on the system's usage profile, different types of caching make more or less sense. For instance, if users tend to access multiple documents within a record in a relatively short time, say 10 minutes, then it makes sense to cache the records and the user's rights to them; but if users only tend to access a single document within a record, this makes less sense. Consideration also needs to be given to the memory pressure that caching will create if not using a separate caching layer, such as Redis.
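A minimal sketch of the sort of in-memory, time-limited cache I have in mind (illustrative only, not production code; a real deployment would also want size limits, or a separate layer such as Redis):

```csharp
using System;
using System.Collections.Concurrent;

// Caches a user's access-check result for a record for a fixed time window,
// so repeated document requests within that window skip the CRM round trip.
public class AccessCache
{
    readonly ConcurrentDictionary<string, Tuple<bool, DateTime>> entries =
        new ConcurrentDictionary<string, Tuple<bool, DateTime>>();
    readonly TimeSpan ttl;

    public AccessCache(TimeSpan ttl) { this.ttl = ttl; }

    public bool TryGet(Guid userId, Guid recordId, out bool hasAccess)
    {
        hasAccess = false;
        Tuple<bool, DateTime> entry;
        if (!entries.TryGetValue(Key(userId, recordId), out entry))
            return false;
        if (DateTime.UtcNow - entry.Item2 > ttl)
            return false; // stale; caller should re-check against CRM
        hasAccess = entry.Item1;
        return true;
    }

    public void Set(Guid userId, Guid recordId, bool hasAccess)
    {
        entries[Key(userId, recordId)] = Tuple.Create(hasAccess, DateTime.UtcNow);
    }

    static string Key(Guid userId, Guid recordId)
    {
        return userId + "|" + recordId;
    }
}
```

In the module above, this would sit between the user lookup and the RetrievePrincipalAccessRequest call: hit the cache first, and only go to CRM (and then Set the result) on a miss.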

The simplest way of deploying this HttpModule is to install the assembly in the GAC and then manually modify the web.config of the relevant MS SharePoint site, adding the module to configuration/system.webServer/modules:

<add name="CRMUserPermissions" type="CRMUserPermissions.CRMUserPermissions, CRMUserPermissions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=87b3480442bff091" />

In an upcoming post, I will show how to do this properly, i.e. by packaging it up as a MS SharePoint solution.