I have been meaning to write this post for ages as this is something that I have encountered time and time again during my career.
We have an integration layer on our application with our telephony system. A third party wrote this integration layer, in essence they have a web services that we communicate to and we expose a web service so that they can communicate with us, so far so simple, this is a staple of enterprise development and as I said I have had to deal with situations like this many times, both inside and outside the company, and if you've not done it inside your company and think that none of the problems that you are having would occur if you didn't have to deal with the idiots at Third Part Ltd., let me tell you that you will just have to deal with the idiots at the Other Department.
At any rate, integration was working fine when somebody realized that the spec called for SSL/TLS usage rather than using clear text. In theory this requirement made sense when the various sides of the integration equation where hosted by two different companies, but a change of architecture had been made, which meant that they weren't any longer, so using a secure website for an internal web interface that contained little more than phone numbers and UUIDs seemed like overkill, but the spec is the spec, agile be dammed.
So both Web Services and client apps on either side of the integration equation where reconfigured to use SSL/TLS and this is were the problems started and as it's normally the case in these situations, we started blaming each other.
Furthermore, a supplier customer relationship had been allowed to develop. This is a relationship in which the supplier has the technical know how, or believes he does, and the customer doesn't, or is at least believed not to have it by the supplier. Needless to say that this wasn't the case as we shall see, but for various personnel reasons, i.e. developers on our company leaving, this relationship had taken hold, which meant that our evidence carried less weight as it were, because they were already predisposed to assuming that they were talking to yet another developer who didn't have a clue about how the integration was supposed to be working, which wasn't true but it was true that they knew their system and the history, and while I knew ours but could not comment on historical decisions as this had landed on my lap without handover from somebody that left the company suddenly, not so suddenly but I digress.
We both went back to basis to prove our respective thesis, i.e. it was the other party's fault, because history is written by the victors, you know how this will turn out, but I will continue anyway. The first thing that we discovered was another disagreement in the spec, regarding authentication, which I remedied after a lot of hair pulling.
After that, I found that our side was working fine, furthermore, I used Wireshark to prove that nothing was coming through over the wire on port 443, while stuff was going through on port 80 from the client PC that hosted the client app, which meant that failures on their side were not due to our Web Service, this despite the fact that the app was throwing a 404 error and pointed me to this
link.
I mentioned the supplier customer relationship above, because it helps to explain why this evidence, our Wireshark evidence was ignored, they knew what they were doing we didn't so anything coming from our side was tainted.
To further compound the confusion, the client app would sometimes work when tested by them, which made us extremely suspicious and not work for us, which surely made them more suspicious of our general competence. It was working for them and they were dealing with a bunch of idiots so they probably were doing something stupid.
At this point they were willing to discuss the code that they were using for the first time and we realized that the source of the issue was their code, to be fair it was the usage of our proxy server, when it should not have been used, there were other issues but they are not important.
So I asked them to add this to the system.net element of the client's app.config:
<defaultProxy enabled="false"/>
Lo and behold, everything started to work fine.
Not sure what the lesson is here, I guess if there is one, it's that appearances do matter even in a supposedly analytical field like this one.
This
link seems tangentially related.