This time Global 360 will introduce themselves to the group members, if you’re curious – check the details here
Check out tomorrow’s ONLINE SUBG meeting, in which Jean-Pierre Auconie, the engineer behind the famous MessageBoxViewer tool, will talk us through it; details here
During the last few weeks I’ve been asked to review two separate projects, for two separate companies, developed – naturally – by two separate teams.
The two things both projects had in common were that they both had to deal with legacy “flat files” and they both chose to process these files outside BizTalk using custom code.
In both these cases I completely agree with the decision to use custom code to parse the incoming files. As good as the out-of-the-box flat file support in BizTalk is (made significantly better with the introduction of the flat file wizard, once one gets the hang of using it), there’s no avoiding writing custom code to parse flat files every now and then; some file formats are pretty challenging, with different record types, conditional records, interleaving segments etc.
I do not, however, agree with the decision to perform this custom parsing outside BizTalk.
I’m pretty sure I would not even bother posting this point had I not seen two of these in the same month, but the fact that I did suggests it may be worth posting a quick note.
One of the projects had the code in a console app, called from a windows scheduled task; the application would pick up files from a folder, parse them, and drop the xml representation in another folder, for BizTalk to consume.
The other had a Windows service monitoring a folder, picking up any files, parsing them to a different, simplified, flat file format (!), and dropping them in another folder for BizTalk to consume.
Both of these introduce another component to the mix; such a component needs its own error handling, its own monitoring, deployment strategy, operations manual etc. It also involves a fair bit of re-inventing the wheel – writing code to monitor folders, read files, and write files – stuff that BizTalk does out of the box.
What would have been the correct approach then? Quite simple: a custom disassembler in the receive pipeline.
Writing a custom disassembler is quite simple; at the end of the day, it boils down to developing a class library which implements a few simple interfaces. The main one, IDisassemblerComponent, defines two methods, Disassemble and GetNext (the other interfaces are even simpler, almost insignificant in terms of effort).
Disassemble gets the source message and potentially parses it up front; GetNext is then called repeatedly by the pipeline to receive 0 or more parsed messages, until it returns null. Simple.
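To make the Disassemble/GetNext contract concrete, here is a minimal sketch of the pattern. It deliberately does not reference the BizTalk assemblies: ISketchDisassembler is a hypothetical stand-in I made up for illustration (the real IDisassemblerComponent works with IPipelineContext and IBaseMessage rather than a TextReader and strings), and the pipe-delimited "id|name" record layout is an assumption too.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

// Hypothetical stand-in for the Disassemble/GetNext contract. The real
// IDisassemblerComponent works with IPipelineContext and IBaseMessage
// instead of a TextReader and strings.
public interface ISketchDisassembler
{
    void Disassemble(TextReader source);
    string GetNext();
}

public class FlatFileSketchDisassembler : ISketchDisassembler
{
    private readonly Queue<string> messages = new Queue<string>();

    // Parses the whole file up front, queuing one XML message per record.
    public void Disassemble(TextReader source)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            if (line.Trim().Length == 0)
            {
                continue; // ignore blank lines
            }

            // Assumed record layout, for illustration only: id|name
            string[] fields = line.Split('|');
            XElement record = new XElement("Record",
                new XElement("Id", fields[0]),
                new XElement("Name", fields.Length > 1 ? fields[1] : string.Empty));
            messages.Enqueue(record.ToString(SaveOptions.DisableFormatting));
        }
    }

    // Called repeatedly by the caller; null signals there are no more messages.
    public string GetNext()
    {
        return messages.Count > 0 ? messages.Dequeue() : null;
    }
}
```

The existing parsing code (from a console app or service) would sit where the record-building lines are; the folder monitoring and file IO disappear, because the receive location takes care of them.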
In one of the projects I’ve since taken their existing code (console app), refactored it into a class library, and wrapped it in a custom disassembler class that calls it; converting the scenario to a BizTalk pipeline and performing the key “developer testing” took less than a day.
Why did they not do it to begin with? Whilst sometimes there are valid reasons, technical or otherwise, I suspect that in this case it was just unfamiliarity with BizTalk and some lack of confidence in the development team’s ability to learn and implement (or their belief in themselves). These are valid concerns for any project manager, but I would suggest that a better course of action would have been to spend some time looking at what it takes to implement a custom disassembler, seeing that it’s not at all that scary, and by doing so learning more about a product used in the solution (BizTalk) and achieving a better architecture and a more maintainable approach.
I’ve been toying with message creation a few times in the past, and recently turned to Paolo for help with a question; Paolo has an amazing blog, and he has now posted some of his wisdom around ways to create messages from a helper class to an orchestration; well worth reading (as is any entry on this fantastic blog, really!)
In a previous post I’ve mentioned our constant attempt to strike the right balance when it comes to loosely coupled services; I’ve mentioned that we were looking at two different scenarios – loosely coupling calls to services outside BizTalk and loosely coupling calls to services inside BizTalk (once implemented within the BizTalk group)
I’ve also mentioned that our solution is composed of a few distinct ‘areas’ (each one generally encapsulated as a BizTalk Application), which we consider, in most cases, to be service boundaries, and – within one ‘flow’ of an incoming request message, we will often have to cross one or more of these boundaries to achieve our end goal.
In most cases, in our solution, the ‘subscriber’ service would use the schema of the ‘publisher’ service for its incoming message; this roughly follows the principle of a service’s proxy, albeit a bit upside down - for practical reasons. Only that - and that’s a much bigger difference - we don’t create a copy of the schema as a service proxy would, but rather reference the schema of the publisher directly (through a shared assembly); this, of course, creates a strong dependency between the two and - over time – this has caused us a lot of headache around deployments as whenever we wanted to update the publisher, we’d have to remove the subscriber too.
Recently we have experimented with following more closely the service proxy approach and instead of using the same referenced schema (using a shared assembly), we’ve created an identical copy of it – same root node and namespace - in the ‘subscriber’ side.
The assumption was that we would be publishing a message using the copy of the schema the publisher holds and receiving it using the copy of the schema the subscriber holds; as the message itself would look exactly the same, and would inevitably have exactly the same message type, it would be picked up by the subscriber successfully.
Had it worked, we would have been able to avoid the dependency between the subscriber and the publisher, which would help us gain much needed flexibility in the publisher to support, and change for, multiple subscribers.
Theoretically - if the publisher schema had to change (say – to support functionality required by other subscribers), as long as the change is backwards compatible such as added elements, we could replace the publisher copy of the schema, but leave the subscriber copy as is, until such a point that we need/want to update the subscriber process.
Well - in BizTalk 2006 – this would have worked just fine; unfortunately – from R2 onward, it no longer works – when an orchestration receives a message, it often does so based on a subscription that included the ‘messaging message-type’ (root node and namespace); however – starting with R2 – an additional check has been introduced – to compare the full .net type name of the schema used by the publisher message with the full .net type name of the schema used by the subscriber, assembly, version and all.
This check obviously fails in our scenario, and our fancy loose-coupling solution no longer works (in R2 or 2009).
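The difference between the two checks can be sketched in a few lines. GetMessageType builds the messaging message type (namespace, '#', root node name) from an instance, assuming a namespace-qualified root element. WouldDeliver is purely my own illustrative model of the behaviour described above, not BizTalk's actual code, and all the names in it are hypothetical.

```csharp
using System;
using System.Xml.Linq;

public static class MessageTypeSketch
{
    // Messaging message type: "<namespace>#<root node name>".
    // Assumes the root element is namespace-qualified.
    public static string GetMessageType(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        return root.Name.NamespaceName + "#" + root.Name.LocalName;
    }

    // Illustrative model only (not BizTalk's implementation): with the strict
    // check switched on, a matching message type is no longer enough; the full
    // .net type names of the two schemas must match as well.
    public static bool WouldDeliver(
        string publishedMessageType, string publishedSchemaTypeName,
        string expectedMessageType, string expectedSchemaTypeName,
        bool strictTypeCheck)
    {
        bool messageTypeMatches =
            string.Equals(publishedMessageType, expectedMessageType, StringComparison.Ordinal);
        if (!strictTypeCheck)
        {
            return messageTypeMatches;
        }

        return messageTypeMatches &&
            string.Equals(publishedSchemaTypeName, expectedSchemaTypeName, StringComparison.Ordinal);
    }
}
```

Two identical copies of a schema compiled into two assemblies give the same message type but different assembly-qualified type names, which is exactly the case where the relaxed and strict models disagree.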
I think this check is actually a result of code introduced as a hotfix for BizTalk 2004 which, for one reason or another, did not make its way into BizTalk 2006 but did into later versions, but I’m not sure. Either way, it’s important to note the workaround described at the bottom of the hotfix description, as it appears this behaviour can be turned off, but one would have to check carefully the potential impact.
What else could we do?
Well – one pattern we know works fairly well is the broker pattern: there’s the publisher, with its own schema, there’s the subscriber, with its own, completely different, schema, and there’s a broker – a third process that has dependencies on both and contains a map to convert one to the other. On the plus side, this gives us all the flexibility we need: at any one point we only need to deploy two entities (the publisher or the subscriber, and the broker), which is good enough; having the process, with a map, allows us to use multi-part messages if we deem them suitable, and all the complexity we need in the mapping. On the down side there are more artifacts to deploy and manage but, more crucially, there is one extra message box hop which, in a low latency scenario such as ours, is not a small price to pay.
Another option is to simply expose the subscriber as a service and call it as such. There are big benefits to doing that, including the fact that we can now have a copy of the schema, in the form of a proxy or without one, and we have also decoupled the services in terms of BizTalk groups – the other service can be anywhere, although this was never a requirement for us. However, we’re paying in more pub/sub again, as well as more IO and quite possibly more complexity.
Theoretically we could also have used XmlDocument (or any other generic wrapper, for that matter) to convey the message, but a) I don’t like typeless interchanges and b) this does not work well in cases where correlation is required, as the following receive tends to short-cut the subscriber and pick up the request as a response; that is, unless you’re willing to introduce two wrappers, one for the request and another for the response.
Sandro Pereira has posted a question, and answer, in the BizTalk newsgroup (he also described his answer, in detail, on his blog) about debugging expression code in Visual Studio
He wasn’t referring to debugging code in helper classes, but code in expression and assignment shapes.
My answer was that this was not possible, but Sandro quickly proved me wrong, as he demonstrates in his answer and blog post, and this got me thinking. Firstly, despite knowing about the option to use the generated code (and actually using it on very rare occasions to understand a certain BizTalk behaviour), I had never thought of using it for debugging, and that is an interesting thought.
However – I had to wonder – how come I never came across the need to? in all those years of BizTalk development, not even once can I remember thinking – oh! I could solve this if only I could debug the piece of code in this shape..
The reason, I think, is twofold -
1. I rarely have more than 2-3 lines of code in an expression shape of any kind; if it’s not straight assignments, it’s going into a helper method; it’s cleaner, it’s more reusable, and it’s easier to debug.
2. I use trace. a lot. and so every few shapes or so, and certainly in expression shapes, I will have a trace line that outputs to a log file important information about the state, and the flow of the process; this proves to be invaluable when troubleshooting issues on the live environments, but also really helpful in development.
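As a rough sketch of both habits combined: the logic lives in a static helper that traces its input and output, so the expression shape is left with a single call. The class, the method and the customer-code scenario are hypothetical, made up for illustration; System.Diagnostics.Trace is the standard .NET tracing API, and where its output ends up depends on the listeners you configure.

```csharp
using System;
using System.Diagnostics;

public static class OrchestrationHelperSketch
{
    // Hypothetical helper: trims and upper-cases a customer code, tracing
    // the state on the way in and on the way out.
    public static string NormaliseCustomerCode(string raw)
    {
        Trace.WriteLine("NormaliseCustomerCode: input='" + raw + "'");

        string result = (raw ?? string.Empty).Trim().ToUpperInvariant();

        Trace.WriteLine("NormaliseCustomerCode: output='" + result + "'");
        return result;
    }
}
```

The expression shape then contains just one line along the lines of customerCode = OrchestrationHelperSketch.NormaliseCustomerCode(rawCode); and the helper can be unit tested and debugged like any other class library code.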
Recently we’ve started to consume a new version of a web service we’ve been using for a while.
We’ve known that, as a whole, not much had changed, only that they have now moved to WCF; they would have migrated their classes to VS 2008 but would expose pretty much the same functions, using pretty much the same parameters.
Still – it appears that BizTalk now insists on generating multiple schemas for the web reference, and as more of the service is moved across more schemas are introduced.
This caused Oleg a fair amount of pain as, whenever new schemas were introduced, the existing schemas would be re-ordered, so reference1.xsd (in the web reference) would suddenly become reference2.xsd, which in turn broke our maps.
The process of finding out the logic behind which schemas are created was fairly short and simple, but as I’ve documented it I thought I might as well share it -
Initial observation revealed that whilst the ASMX services’ WSDL file contains all the schemas needed, the WCF services use import statements in the WSDL file; the schemas exist in separate ‘files’.
ASMX services always use the XmlSerializer; WCF services use the DataContractSerializer by default, but can be configured to use the XmlSerializer if required.
Here’s a walkthrough of the scenarios we’ve compared (using BizTalk 2006) -
Standard WCF project, DataContractFormat
We’ll start by comparing the standard WCF sample project generated when you create a new WCF service application in Visual Studio 2008 -
[ServiceContract]
public interface IService1
{
[OperationContract]
string GetData(int value);
[OperationContract]
CompositeType GetDataUsingDataContract(CompositeType composite);
}
[DataContract]
public class CompositeType
{
bool boolValue = true;
string stringValue = "Hello ";
[DataMember]
public bool BoolValue
{
get { return boolValue; }
set { boolValue = value; }
}
[DataMember]
public string StringValue
{
get { return stringValue; }
set { stringValue = value; }
}
}
Looking at the WSDL generated, 3 schemas are imported –
1. The usual generic types
2. The definition of the CompositeType type
3. The definition of the service’s messages (GetData, GetDataResponse, GetDataUsingDataContract, GetDataUsingDataContractResponse)
Adding a web reference to this service from a BizTalk 2006 project we can see it represents this fairly accurately -
We can see all 3 schemas downloaded from the service, but within the reference.map generated code a single reference.odx defines the methods in the form of ports and web-messages, and reference.xsd defines the CompositeType schema.
Equivalent project in an ASMX service
I’ve created an equivalent ASMX service, which looks like this –
[WebService(Namespace = "http://tempuri.org/")]
public class Service1 : System.Web.Services.WebService
{
[WebMethod]
public CompositeType HelloWorld(CompositeType composite)
{
CompositeType response = new CompositeType();
response.StringValue = "Hello World";
return response;
}
}
public class CompositeType
{
bool boolValue = true;
string stringValue = "Hello ";
public bool BoolValue
{
get { return boolValue; }
set { boolValue = value; }
}
public string StringValue
{
get { return stringValue; }
set { stringValue = value; }
}
}
Publishing this service I can see its WSDL contains a single schema that represents the service’s messages and the CompositeType definition (it does not use import, but that proved to be insignificant).
Consuming this service from a BizTalk 2006 project, only the WSDL file is downloaded (there are no ‘external schemas’ to worry about), but within the reference.map pretty much the same odx and xsd files are generated; no real difference between ASMX and WCF here.
Next I’ve looked at changing the serializer the WCF service works with from DataContract to XmlSerializer –
Standard WCF project, XmlSerializerFormat
Now we will change the serializer to XmlSerializer by adding XmlSerializerFormatAttribute to both the service and the data contracts -
[ServiceContract]
[XmlSerializerFormat]
public interface IService1
{
…and
[DataContract]
[XmlSerializerFormat]
public class CompositeType
{
The WSDL in this case includes only one import, for a single schema representing both the service messages and the compositeType schema (basic types are not exposed) and BizTalk now only has one schema downloaded, but again – the reference.map code remained identical (one ODX, one schema)
How will adding a second namespace affect these behaviours? Let’s investigate -
WCF project, two namespaces DataContractFormat
To demonstrate this I’ll add another data contract - AnotherCompositeType, specify an explicit namespace for it and include it as a second parameter to the GetDataUsingDataContract operation -
[DataContract(Namespace="http://SomeNamespace")]
public class AnotherCompositeType
{
bool boolValue = true;
string stringValue = "Hello ";
[DataMember]
public bool BoolValue
{
get { return boolValue; }
set { boolValue = value; }
}
[DataMember]
public string StringValue
{
get { return stringValue; }
set { stringValue = value; }
}
}
[OperationContract]
CompositeType GetDataUsingDataContract(CompositeType composite, AnotherCompositeType anotherComposite);
Using DataContractFormat again, but with two classes, representing two different namespaces, we’re now getting yet another schema - the fourth one - representing the added data contract (if the namespaces of both data contracts were the same, the DataContractFormat would have included them in the same schema)
On the BizTalk side, the reference.map code now also contains a second schema, one describes the original CompositeType, and a second describes the second type – AnotherCompositeType and here as well – were the two types in the same namespace, a single schema would exist, describing both.
Let’s look at the same again, using the XmlSerializerFormat
WCF project, two namespaces XmlSerializerFormat
Adding the XmlSerializerFormat, I also have to remember to include the XmlRoot attribute to set the namespace, as the serializer does not look at the DataContract attribute -
[DataContract(Namespace = "http://SomeNamespace")]
[XmlSerializerFormat]
[XmlRoot(Namespace = "http://SomeNamespace")]
public class AnotherCompositeType
{
bool boolValue = true;
string stringValue = "Hello ";
[DataMember]
public bool BoolValue
{
get { return boolValue; }
set { boolValue = value; }
}
[DataMember]
public string StringValue
{
get { return stringValue; }
set { stringValue = value; }
}
}
Now the WSDL for this service, using the XmlSerializerFormat, imports two schemas: one for the service messages and the CompositeType schema, which all reside in the same namespace, and a second for AnotherCompositeType, which is defined in a separate namespace.
Consuming this from BizTalk, I’m again getting two schemas – one for each namespace.
So far – switching between DataContractFormat and XmlSerializerFormat made no difference to the generated code under reference.map, but it did change the way the WSDL is constructed (import vs. embedded schemas) and therefore the downloaded components (wsdl and schemas, vs. wsdl only)
Note - another thing I’ve noticed is that when a new schema needs to be generated under the reference.map code, as a result of a change to the service, updating web reference does not seem to do so; I had to delete the web reference and re-add it to see the newly added schema.
Last - let’s look at how the ASMX service behaves with two namespaces –
ASMX service with two namespaces
I’ve added the second class, and added it as a parameter to my web method
[XmlRoot(Namespace = "http://AnotherNamespace")]
public class AnotherCompositeType
{
bool boolValue = true;
string stringValue = "Hello ";
public bool BoolValue
{
get { return boolValue; }
set { boolValue = value; }
}
public string StringValue
{
get { return stringValue; }
set { stringValue = value; }
}
}
And, still consistent – when consumed from BizTalk 2006 I’m getting only the WSDL downloaded (two schemas are embedded) but the reference.map code contains two schemas – one for each namespace.
To summarise -
Using the DataContractFormat you will always get one schema for generic types, one schema for the service’s messages, and then one schema for each namespace any other types are declared in (0..n)
Using the XmlSerializer (whether in ASMX, where the schemas are embedded in the WSDL file, or in WCF through the XmlSerializerFormat, where they are imported) you would get one schema per xml namespace used.
As far as BizTalk generated code is concerned, however, there’s no difference between the two.
What this meant to us – well – we understand better, but there’s still not much we can do.
In our case we control the service and, in fact, we know that the only reason we encounter multiple xml namespaces in the service contract is that the various classes exist in several .net namespaces. The service team had not supplied the DataContract attribute on any class, and had certainly not supplied the namespace parameter to that attribute, which meant the xml namespace was derived from the .net namespace, resulting in multiple namespaces and therefore multiple schemas.
Once that team added the attribute, and used a consistent xml namespace throughout, our immediate problem was solved; however, had it been a third party’s service we would not have had that luxury, and we would have had to update our code whenever we updated our web reference, even if only new types were added in a backwards compatible way (as the schema ordering may have changed)
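As a sketch of what such a fix can look like: two data contracts living in different .net namespaces, both pinned to one explicit xml namespace through a shared constant. The type names and the namespace URI are hypothetical. Without the Namespace parameter, the data contract namespace defaults to http://schemas.datacontract.org/2004/07/ followed by the .net namespace, which is how one xml namespace per .net namespace comes about.

```csharp
using System.Runtime.Serialization;

// Hypothetical shared constant so every contract uses the same xml namespace.
public static class ContractNamespaces
{
    public const string Service = "http://example.org/services/2009";
}

namespace Sample.Orders
{
    [DataContract(Namespace = ContractNamespaces.Service)]
    public class Order
    {
        [DataMember]
        public string Id { get; set; }
    }
}

namespace Sample.Customers
{
    // Different .net namespace, same xml namespace as Order.
    [DataContract(Namespace = ContractNamespaces.Service)]
    public class Customer
    {
        [DataMember]
        public string Name { get; set; }
    }
}
```

With both types in the same xml namespace, the earlier observations suggest they would end up described by a single schema rather than one each.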
On that – it’s probably easier to simply rename the schemas (and the underlying .net types) under the reference.map generated code rather than the referencing maps and messages.
Richard Seroter wrote a review of this book on his blog
Whilst I haven’t yet finished reading the book, I completely agree with Richard – this book is very well written, and is doing a fantastic job explaining this fascinating, and often misunderstood, if not completely overlooked, capability of BizTalk.
The book also does a very good job of looking at scenarios outside BizTalk Server, making it well worth reading for any enterprise solution architect.
Highly recommended.
A while back I’ve posted about the different ways to create messages in an orchestration, and later some performance comparison between them.
Mostly for fun I ran a quick test on my newly installed laptop; I did not put in nearly as much effort as I had previously, so don’t read too much into these numbers, but I was amazed to see that all the results were running pretty much 10 times faster.
Now – it’s a new BizTalk (2009), new SQL server (2008), new operating system (Windows 7) and a new(-ish) laptop (Thinkpad T61), so there’s no way to know how much each component contributed to the improvement, but it is amazing how much difference can exist after just one year!
Well – not at all scientific, but I found it interesting anyway!
Ewan Fairweather is doing a web cast TODAY on BizTalk 2009 performance tests he’s done both on and off Hyper-V infrastructure.
I’ve seen bits of it before and it is highly recommended - you can pretty much count on this being extremely useful if you’re serious about your BizTalk deployment.
When you generate a class out of a schema with an element configured to allow mixed content (child attributes and elements as well as text), you should expect the generated field that holds the text to be a string array.
So - if you have a schema that looks like this
<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns="http://tempuri.org/XMLSchema.xsd" xmlns:mstns="http://tempuri.org/XMLSchema.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="SomeElement">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="Child1" type="xs:string"/>
<xs:element name="Child2" type="xs:string"/>
<xs:element name="Child3" type="xs:string"/>
</xs:sequence>
<xs:attribute name="SomeAttribute" type="xs:string"/>
</xs:complexType>
</xs:element>
</xs:schema>
(‘SomeElement’ being a complex type allowing mixed content)
The fields in the generated class would look like
public partial class SomeElement {
private string child1Field;
private string child2Field;
private string child3Field;
private string[] textField;
private string someAttributeField;
.
.
.
The reason for the array of strings (instead of just one string field) is that an XML corresponding to the schema might look like this –
<SomeElement xmlns="http://tempuri.org/XMLSchema.xsd" SomeAttribute="someAttributeValue">
Some free text
<Child1>Child1 text</Child1>
Some more free text
<Child2>Child2 text</Child2>
yet some more free text
<Child3>Child3 text</Child3>
</SomeElement>
And so by using a string array to hold the text the deserialiser can keep string portions separately.
Initially I thought this allows the structure to represent the original xml accurately, but this is not exactly the case – you would still not know for certain where each string portion existed, especially if in the source XML you get a few elements that don’t have text between them, which, I suspect, is why when I serialise the instance back to xml I actually get –
<?xml version="1.0"?>
<SomeElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" SomeAttribute="someAttributeValue" xmlns="http://tempuri.org/XMLSchema.xsd">
<Child1>Child1 text</Child1>
<Child2>Child2 text</Child2>
<Child3>Child3 text</Child3>
Some free text
Some more free text
yet some more free text
</SomeElement>
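The round trip above can be reproduced with a small hand-written class. This is a sketch, not the exact generated code: I use public fields instead of the generated properties for brevity, and MixedContentSketch with its Read and Write helpers is made up for illustration. The attributes themselves (XmlType, XmlRoot, XmlText, XmlAttribute) are the standard System.Xml.Serialization ones.

```csharp
using System.IO;
using System.Xml.Serialization;

[XmlType(AnonymousType = true, Namespace = "http://tempuri.org/XMLSchema.xsd")]
[XmlRoot(Namespace = "http://tempuri.org/XMLSchema.xsd", IsNullable = false)]
public class SomeElement
{
    public string Child1;
    public string Child2;
    public string Child3;

    // Mixed content: each text portion found between the child elements
    // becomes one entry in this array.
    [XmlText]
    public string[] Text;

    [XmlAttribute]
    public string SomeAttribute;
}

public static class MixedContentSketch
{
    public static SomeElement Read(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SomeElement));
        using (StringReader reader = new StringReader(xml))
        {
            return (SomeElement)serializer.Deserialize(reader);
        }
    }

    public static string Write(SomeElement value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SomeElement));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Calling Write(Read(xml)) on the sample instance should show the same effect as above: the array keeps the text portions apart, but nothing records which elements each portion sat between, so the serializer has no way to put them back in their original positions.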
Now, I don’t particularly like this sort of xml, and shy away from mixed content; I don’t believe that xml snippets like my samples above are useful, specifically I don’t think that mixing elements and text is particularly nice.
However, consider an element with an attribute and some text – the following is quite reasonable I think, and yet requires mixed content -
<Phone type="mobile">some text here</Phone>
I’m working on a fairly large BizTalk implementation, which in many ways does not fit the classic BizTalk project pattern in that, whilst it has a large element of integration to it, it is even more a business process rich application.
One of the things that constantly keeps us busy is trying to strike the right balance between complexity/performance and maintenance, and a big part of that is finding just how loosely coupled the solution can and should be; this is something I find ourselves spending more and more time on.
There are two parts to this challenge for us – one is about ensuring our processes are not too tightly coupled with any of the many services we’re calling (internal or external) and the other is that different ‘areas’ of our solution, although implemented on BizTalk, and even in most cases within the same group, are treated as independent ‘services’ and are decoupled from each other; we’ve learnt the hard way the price of not getting this right – mostly in terms of maintenance complexity and cost – and are keen to find the right answers, if there are any.
In this part I would like to start tackling the first point – calling services -
In a recent post I ranted somewhat about the WCF adapter’s lack of support for multi-part messages, a topic I’m likely to come back to every now and then I suspect.
This was triggered by my attempt to prototype a way to call services from orchestrations in a loosely coupled way – ideally without the orchestration having to know about the service implementation (isn’t that the promise behind BizTalk Server?)
Naively, I started with a very simple plan – the orchestration would have its own representation for the service call, which it would publish to the message box; the send port, configured with the WCF adapter, would pick up the request based on some subscription, a map in the send port would convert the request to the service’s format and the service would be called; the response would be handled in a similar way.
I have documented that approach here
This worked pretty well in my little prototype, and I was rather pleased with myself for a short while, at least until Ben was tasked with taking this approach into our real world; it took very little time for it to break – all he had to do is call a real service - one that takes more than one parameter (how’s that for a lesson to not over simplify prototypes…)
To be fair, this needn’t have been an issue had we not used multipart messages in our processes, but – accustomed to using the SOAP adapter – we were using multi-part messages to represent the numerous parameters to the service, and – as I’ve hinted in my previous post – I do sincerely believe this to be the better approach.
That meant two things – 1. the WCF adapter would not be able to transmit the message, as it does not support multipart messages, and - 2. we would only be able to map the body part of the message.
These pretty much put a lid on my idea.
The WCF adapter’s lack of support for multi-part messages we could probably work around using a custom assembler or encoder – such a component could take a multipart message and convert it to a single part message using some pre-defined rules and/or part names, context properties, etc. but the map limitation really puts the lid on this approach.
What can we do then? well – we’d have to create a schema in our process that contains all the parts needed for the service calls, this does not have to look like the service contract, but it does have to contain all the information required to construct the service’s request and effectively means describing the multipart message in XSD; this makes me slightly uncomfortable as, in a sense, this schema exists specifically for this service; it also means we would have to work harder in the process to construct this message – on top of creating and deploying another schema, we would also need some map to convert the multiple messages containing the information to the single entity (or use some code to achieve the same), but it would work – once we have the single part message, we could publish it, map it to the service’s format in the port and deliver it to the service, in the exact same way that worked in my prototype.
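If code rather than a map is used to build that single-part message, it could look roughly like the sketch below: each former message part becomes a named child of one wrapper element. The class, the method and the wrapper and part names are hypothetical, and the matching XSD for the wrapper would still have to be created and deployed as described above.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class RequestEnvelopeSketch
{
    // Wraps each named part in a child element of a single wrapper element,
    // preserving the order in which the parts are supplied.
    public static XElement Combine(
        XName wrapperName,
        IEnumerable<KeyValuePair<string, XElement>> parts)
    {
        XElement wrapper = new XElement(wrapperName);
        foreach (KeyValuePair<string, XElement> part in parts)
        {
            // The child element is named after the original message part;
            // the part's xml is copied, leaving the source untouched.
            wrapper.Add(new XElement(
                wrapperName.Namespace + part.Key,
                new XElement(part.Value)));
        }

        return wrapper;
    }
}
```

The resulting element is an ordinary single-part message body, so it can be published, mapped in the send port and handed to the WCF adapter in the same way as in the prototype.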
At the moment, this looks to me as our only bet, but I’m looking for alternatives, if anyone has any suggestions?
You can read the details here
Interestingly I think the scope of the problem was bigger than we originally thought, certainly bigger than what I blogged about anyway, in that even if you don't have the transform shape problem (for example, if you've created the orchestration on a machine with the hotfix, but then opened it on a machine without it) you can have some issues -
In one case we had a map where the input was composed of several schemas; as you probably know, BizTalk effectively creates a schema that aggregates all the parts into a single schema; to achieve this it had to import several schemas from our solution, all sharing the same namespace.
When we tried to build the project with the map, we received errors saying a certain element type could not be found; the cause was that BizTalk was looking for it in the wrong schema: as it had several schemas imported into the same namespace, it chose the wrong one and could not find the type definition.
Once we've installed the fix, and with no other changes, the map compiled all right.
I wrote myself notes for this post months ago, but haven’t taken the time to write it down properly; a passionate discussion today with Rupert Benbrook prompted me to get back to this.
The WCF adapter does not support multi-part messages, and some would argue that this suggests the feature may be dropped in some future version of the product.
This small fact seems to have been glossed over by most – I, for one, haven’t heard this discussed anywhere, to the point that I’m left wondering if I’m really just “stuck in my ways”, but I honestly think that’s a poor decision.
Trying to search for some detail about this fact I found the following, rather old, but still very valid and informative, post; here are some of the key points made -
The reason support for multi-part message in WCF adapter is not being recommended is because WCF does not support it and the model proposed is that whoever wants to use something equivalent, need to aggregate the messages into one xml and send it as one single-part message.
(Tapas Nayak, Principal SDE, BizTalk Adapter Pack)
Multipart messages are a marginalized feature in BizTalk, the pipeline components like the flat-file and xml don’t understand them, same story for the majority of the adapters. So we decided to not support them in the WCF Adapter too
(John Taylor, Senior Technical Lead, BizTalk Server)
With regards to the first statement – I think that is somewhat down to interpretation – WCF did not change how web services work; WSDL is the same WSDL, specifications have not really changed; so – WCF is just a new (and of course – much better) implementation; this is an important basis for any discussion because it indicates that the WCF adapter could have taken the same path the SOAP adapter did.
Consider the following Service definition in WCF:
[OperationContract]
CompositeType GetDataUsingDataContract(CompositeType composite, CompositeType composite2, out CompositeType compositeOut);
or the equivalent web method in asmx:
[WebMethod]
public CompositeType GetDataUsingDataContract(CompositeType composite, CompositeType composite2, out CompositeType compositeOut)
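{
    // Body omitted in the original post; placeholder so the method compiles.
    throw new NotImplementedException();
}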
The WSDL for both methods is pretty much identical; the operation definition looks as follows -
<wsdl:operation name="GetDataUsingDataContract">
<wsdl:input wsaw:Action="http://tempuri.org/IService1/GetDataUsingDataContract" message="tns:IService1_GetDataUsingDataContract_InputMessage"/>
<wsdl:output wsaw:Action="http://tempuri.org/IService1/GetDataUsingDataContractResponse" message="tns:IService1_GetDataUsingDataContract_OutputMessage"/>
</wsdl:operation>
The messages are defined like so -
<wsdl:message name="IService1_GetDataUsingDataContract_InputMessage">
<wsdl:part name="parameters" element="tns:GetDataUsingDataContract"/>
</wsdl:message>
<wsdl:message name="IService1_GetDataUsingDataContract_OutputMessage">
<wsdl:part name="parameters" element="tns:GetDataUsingDataContractResponse"/>
</wsdl:message>
And the schemas, in both cases, define the aggregated messages – with all the in/out parameters as required.
So – what is the difference? The way I see it, it is in the way the different adapters handle the messages as they pass through them. The SOAP adapter has a layer which, when receiving messages from the outside, breaks them into individual parts, delivering a multi-part message to the message box; similarly, when delivering messages from BizTalk to the service, it composes them, aggregating the various parts of the request message into a single SOAP envelope. The WCF adapter does not include such functionality – it will take the message BizTalk provides and will wrap it with the SOAP envelope pretty much as is.
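To make that composition step concrete, here is a rough sketch of the kind of aggregation the SOAP adapter does on your behalf and that, with the WCF adapter, you end up doing yourself (in a map or in code). It is not adapter code; the class name RequestAggregator is made up, and the child element names are an assumption that simply mirrors the parameter names of the operation above.

```csharp
using System;
using System.Xml.Linq;

public static class RequestAggregator
{
    // Namespace taken from the WSDL above.
    private static readonly XNamespace Tns = "http://tempuri.org/";

    // Wraps the content of two independent "part" documents in the single
    // wrapper element the operation expects. The inputs are left untouched:
    // XElement clones child elements that already have a parent.
    // Attributes on the two root elements are not copied (sketch only).
    public static XElement Aggregate(XElement composite, XElement composite2)
    {
        return new XElement(Tns + "GetDataUsingDataContract",
            new XElement(Tns + "composite", composite.Elements()),
            new XElement(Tns + "composite2", composite2.Elements()));
    }
}
```

With a multi-part message the equivalent is a couple of part assignments in an assign shape; here the wrapper (and the schema describing it) has to exist first.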
All my grievances about the SOAP adapter aside – I think its approach is a good one; it allowed the process to focus on the flow of the business process and avoided the need to define and construct artificial entities in the form of aggregated schemas, which often make very little sense (as the various parts can be largely unrelated from a process perspective, but still necessary for the service to perform its operation).
In my discussion with Rupert the same point was raised – the fact that SOAP supports one message in and one message out (i.e. no multipart messages), but in my view that’s a problematic argument – again – the whole point of the BizTalk architecture is that the send port can mask the transport mechanisms; ok – the SOAP adapter is not doing a particularly good job at that, but that’s hardly an excuse to make matters worse – I don’t want to have to know that I need an aggregated message to call this particular service because it uses the WCF adapter.
Moving on - I particularly disagree with the second statement mentioned - I think multipart messages are a very important feature in BizTalk, and would be surprised if any seasoned BizTalk developer would tell me they prefer not to use them (do let me know if you think I’m wrong); to start with – it allows one to define a type once and then use it multiple times; this makes maintenance infinitely better when types need to be changed (I thought it ought to be this schema, but found that this other schema is better); ok – so you have to be careful about how and where you re-use, to not introduce unwanted dependencies, but if you have large orchestrations which follow several steps, this is very useful.
Secondly, it allows one to easily link pieces of information that are only ‘somewhat-related’, especially important, in my view, when using pub/sub and having to pass several pieces of information from the publisher to the subscribers; the alternative, not surprisingly, would be exactly what is suggested above – create a schema defining the various pieces of information required and use that instead, but I think that’s a much larger maintenance overhead than a multi-part message definition; it is much easier to change a message definition than it is to change a schema, especially in large implementations with many different assemblies.
This was the point most passionately discussed with Rupert today – he argued that if a message is important enough to be published (with all its parts) it is important enough to create a schema for it; the point I was making, possibly not well enough, is that – to me, specifically in this area, the two are the same – I can define my aggregated message in a schema, that would have, say – 2 elements, each using a type defined in my other, existing schemas, or I could use a multi-part message, with two parts, each pointing at existing schemas; there’s very little difference in principle – I’m defining something – strongly typed and well defined, that is composed of two other, well known things, and I’m passing it around.
Obviously interoperability can only be achieved using standards like XSD, but that’s not a consideration at this point – I’m publishing a message from BizTalk, which would be subscribed to by BizTalk, so what’s wrong with describing it using BizTalk terms where conceptually there’s virtually no difference?
If there is no difference – why do I care? Because multi-part messages are easier to create than aggregated schemas, and there are fewer deployed artifacts; I can simply put an assign shape and assign existing messages to the relevant parts; if I was to go down the aggregated schema route I would have to define the schema and create a map that would aggregate the messages; I could use code, or I could use xpath to assign the parts to the various elements, but I’d have to use a map or something to create the shell of the message first; seems like a lot of overhead for simply passing a couple of messages between processes.
I agree that many adapters don’t support multipart messages (the file and ftp adapters are the most prominent examples, outside the WCF adapter), and in many cases this makes perfect sense, but some do – SMTP and POP adapters, for example, use it very well; so does the SOAP adapter, of course.
There was one piece where we were mostly in consensus, about the benefits of a single schema versus multi-part messages – when you wish to have a map in the port, I believe BizTalk would only consider the body part of the message, but again – to me that’s a shortfall in the product, not a proof that the existing feature is not useful; for both this and the pub/sub scenario I would have loved to be able to treat a multi-part message type in the same way a single-part message is treated – with a specific type that you can subscribe on (I can always set the message type manually, I guess, but it would have been great to get out of the box support).
Similarly, I agree that the built-in xml and FF disassemblers don’t support multipart messages (only in the sense that they will not disassemble them), but the model does – you can easily access, and parse, multipart messages from within a pipeline component, even a custom disassembler, so I can’t help wondering if the answer is in the following statement made by John Taylor, as quoted in that blog post -
The feature set for the WCF Adapter was potentially very large indeed and we were under some pressure to make cuts where we could.
and I think it’s a big shame.
I, for one, would be very disappointed if multi-part messages stop being supported, and I’m slightly worried – WCF is obviously taking more and more front stage, generally and with BizTalk specifically (even more so with the LOB adapter SDK and the BizTalk adapter pack that makes use of it), and as it does not support multi-part messages, it may well set the tone for future BizTalk developments.
‘Shiri’ had posted this question in the newsgroup -…
…After open the orchestration
debugger both at BT 2004 and BT 2006 we've recognized a different behaviour:
at 2004 executes through all the branches first shape and then back to the
first branch at parallel and executes the second ahape, the second branch
and etc. at 2006 executes the first branch - first and second shape, then
goes to the second branch and executes the first and second and etc.
It seems that 2004 works more like multithreaded then 2006….
I vaguely remember reading/discussing this difference in the past, but – unfortunately - I can't remember the exact details; I've floated this question around again, and I believe (but treat this with care – this may well be at least somewhat inaccurate) that there has indeed been a change in the behaviour of the parallel shape in BizTalk 2006, here’s some background -
The parallel shape was never intended to provide ‘true’ parallelism, and Microsoft has been fairly clear from the start that BizTalk will not process each branch on its own thread (which would have been required).
Darren Jefford explains very well the intention, and the expected behaviour, of the parallel shape in his book Professional BizTalk 2006, which I highly recommend; if you want a quick peek you can read the relevant piece here, the key point he makes is that when thinking about the parallel shape, you need to wear your business analyst hat, and not the developer hat – the parallel shape effectively “says” – hey - I’ll run this code (=branch), but if I reach a point where I’m sitting idle waiting for something to happen (receive, delay or listen shapes) I will go and run that other code (=branch) in the meantime; when I’m through with that (or reached a waiting point again) I will check if I’m ready to process the rest of the first bit of code…and so on…
This is not quite your techie run-things-in-parallel-on-multiple-threads approach, but – from my experience – it is more than enough (if you wanted to run things completely in parallel you could use BizTalk’s pub/sub, which would allow you to potentially get a lot more than just one thread – you might end up on a different machine altogether, for a price :-))
So – indeed – the parallel shape in BizTalk 2006 (and subsequent versions) behaves exactly as I understand it should (and as is described in Darren’s book) and is consistent with Shiri’s observation - if you have three branches, and none of them has a blocking shape – the left most branch will be executed completely, then the next one to its right and then the right most branch; however – and that’s a very important point to remember - Microsoft does not, to the best of my knowledge, guarantee any order of processing between the branches, and this might change in future versions (as indeed it did between 2004 and 2006), so all you can assume is that all the branches can theoretically run in any arbitrary order or indeed in parallel.
Back to the 2006 behaviour - if, however, you had a receive shape as the second shape in the left most branch, when BizTalk would hit this shape it would move on to execute the second branch while it’s waiting for the message to be received; it would come back to the first branch at the earliest point once two things had happened – 1. the message it was waiting for was delivered and 2. it had reached a point in the currently executing branch in which it can stop and re-enter the first branch; this would be a receive shape, delay shape, listen shape or the end of the branch.
So – if that’s the 2006 behaviour, was the behaviour in 2004 different? Yes, I believe it was – in 2004 the engine was, some would say, trying to be too clever - BizTalk 2004 would try to run branches on multiple threads if it could; where “if it could” depends on several factors, not least the state of the thread pool at the time of evaluation; if it managed to do so, you would get code running truly in parallel, as Shiri observed, but there are no guarantees that this would be the case; in that sense BizTalk 2004 is less predictable than later versions of BizTalk, which is exactly the problem with this approach, and - considering that this was never the intention to begin with – I can fully understand the decision to simplify the model in BizTalk 2006.
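For what it’s worth, the 2006 behaviour as I describe it above can be captured in a small toy model. To be clear about the assumptions: this is not how the XLANG engine is implemented, the Shape and ParallelShapeToy names are made up, and the model simply assumes that whatever a branch was waiting for has arrived by the time another branch pauses or ends. It also bakes in a left-to-right preference which, as said, Microsoft does not guarantee.

```csharp
using System;
using System.Collections.Generic;

public sealed class Shape
{
    public string Name { get; }
    public bool Blocks { get; }   // true for receive, delay and listen shapes

    public Shape(string name, bool blocks = false) { Name = name; Blocks = blocks; }
}

public static class ParallelShapeToy
{
    // Runs the branches on a single "thread" and returns the order in which shapes executed.
    public static List<string> Run(IList<IList<Shape>> branches)
    {
        var log = new List<string>();
        var position = new int[branches.Count];   // next shape to run in each branch
        var waited = new bool[branches.Count];    // branch has already paused at its current shape
        int justYielded = -1;

        while (true)
        {
            int b = PickBranch(branches, position, justYielded);
            if (b < 0) break;                      // every branch has finished

            justYielded = -1;
            while (position[b] < branches[b].Count)
            {
                Shape shape = branches[b][position[b]];
                if (shape.Blocks && !waited[b])
                {
                    waited[b] = true;              // sit idle here, let another branch run
                    justYielded = b;
                    break;
                }
                waited[b] = false;
                log.Add("branch" + (b + 1) + ":" + shape.Name);
                position[b]++;
            }
        }
        return log;
    }

    // Left most unfinished branch, skipping the one that has only just started waiting
    // unless it is the only one left.
    private static int PickBranch(IList<IList<Shape>> branches, int[] position, int exclude)
    {
        for (int i = 0; i < branches.Count; i++)
            if (i != exclude && position[i] < branches[i].Count) return i;
        if (exclude >= 0 && position[exclude] < branches[exclude].Count) return exclude;
        return -1;
    }
}
```

With no blocking shapes the log comes out branch by branch, left to right; put a receive as the second shape of the first branch and the second branch runs in full before the first one resumes.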
Integrating DNN with WIF was a somewhat painful exercise. Sure – it was not helped by the fact that I’m not really familiar with DNN, nor am I really a web developer by any stretch of the imagination, but never mind that – DNN has certain ‘features’ that made making it play nice with the WIF somewhat ‘challenging’. Below are some points we’ve encountered that are worth remembering/considering (in no particular order) -
DNN will redirect to ErrorDisplay.aspx on most errors.
This means that once you’ve configured the web site with FAM and automatic passive redirects, most errors will send you in an endless redirect loop.
In my first few attempts to integrate the two, the aliases for my portals in the DNN database were only configured for ‘localhost’; as my STS is on a remote machine (very important for testing federated identity scenarios, obviously) I was now accessing my portal using the machine’s name and/or IP address, for which aliases had to be defined; this error was caught by DNN before the FAM had a chance to execute and so the call to ErrorDisplay.aspx was made when the user was still unauthenticated, but now it no longer carried a security token, which caused the redirect back to the STS and the infinite loop; to avoid that problem, I’ve added a location configuration in the web.config for ErrorDisplay.aspx and set its authorisation settings to allow all users – this allowed the error to be displayed despite the fact the user is not authenticated (something that needs to be considered carefully, of course, but we’re not showing any dangerous details on our error page).
The membership module performs a lookup for the username in DNN’s membership database.
As we’ve moved the authentication work from DNN to the STS, it is now possible for an authenticated user to not exist in the DNN database; we already have a synchronisation process that keeps the two databases (our membership database, used by the STS, and the DNN membership database) aligned as we needed that anyway, but there’s always the chance of things getting out of sync; out of the box, the DNN membership module redirects the user back to the home page; in our case - because we’re using the FAM with auto passive redirects - this will enter an infinite redirect again (as the redirect to the homepage loses the security token), so we’re looking to change the on-error redirect to an allowed page (not an easy task in DNN, it appears).
DNN can host several portals.
DNN supports hosting many portals on the one application (/virtual directory) - driven by database configuration; that means that the FAM configuration, driven via web.config, does not fit very well as we’d have a single realm/replyTo address.
There were two options to choose from – we could decide that the entire DNN instance is a single RP, which would mean the existing configuration solution could suffice, but is a security risk – a user’s permissions cannot be checked at the STS at a portal level, only at ‘DNN level’; the other option was to treat each portal independently; for this to work we had to set the realm and replyTo values in the request to the STS dynamically, as the configuration story was not enough; and so - we’ve extended the FAM by overriding OnRedirectingToIdentityProvider and setting the realm and reply properties of the SignInRequestMessage dynamically (based on the HttpRequest, in our case).
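The interesting part of that override is working out which portal a request belongs to. A minimal sketch of that lookup follows; it is not our actual code, and the PortalRealmResolver name, the alias format (host plus path, no scheme, as in www.MyDomain.com/DNN/MyPortal) and the longest-match rule are all assumptions made for illustration. The port number is ignored for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PortalRealmResolver
{
    // Returns the realm (and reply) address for the portal the request falls under,
    // or null when no known alias matches. The longest matching alias wins, so a
    // child portal is preferred over the DNN root.
    public static Uri ResolveRealm(Uri requestUrl, IEnumerable<string> portalAliases)
    {
        string hostAndPath = requestUrl.Host + requestUrl.AbsolutePath;
        string match = portalAliases
            .Where(alias => IsUnderAlias(hostAndPath, alias))
            .OrderByDescending(alias => alias.Length)
            .FirstOrDefault();

        return match == null
            ? null
            : new Uri(requestUrl.Scheme + "://" + match.TrimEnd('/') + "/");
    }

    // True when the request is the alias itself or sits below it; the "/" guard
    // stops "/DNN/MyPortalX" from matching the alias "/DNN/MyPortal".
    private static bool IsUnderAlias(string hostAndPath, string alias)
    {
        string a = alias.TrimEnd('/');
        return hostAndPath.Equals(a, StringComparison.OrdinalIgnoreCase)
            || hostAndPath.StartsWith(a + "/", StringComparison.OrdinalIgnoreCase);
    }
}
```

In the overridden OnRedirectingToIdentityProvider the result would then be assigned to the realm and reply properties of the SignInRequestMessage, as described above.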
AudienceURI’s
Setting the realm and replyTo dynamically, as described above, raised another challenge - the Geneva Framework would like you to specify the audienceUris you’d be expecting in the tokens received from the STS; generally – this setting exists in the web.config, but as our realm can be one of many things we were faced with two options – either list all possible audienceURIs in the ‘allowed’ list, which has some security implications, or provide a mechanism to dynamically evaluate the request arriving and see if it’s allowed.
The problem with the first approach, out-of-the-box, is that it means having to keep updating the web.config of DNN with newly added portals’ Uris; this actually has two implications – one: it means that whoever sets up a new portal (which is a DNN user, generally), must have access to the config file – not quite what we had planned (or can live with); two: whenever the web.config gets edited, if we did allow that, the appdomain is reloaded, which kicks users out of their sessions (or so I’m told) as well as makes the next call really slow as the application is recompiled; the outcome was clear – we must stay away from the web.config.
It appears that there are several ways around this:
- You can have a single AudienceURI on the RP side, and code your STS to always return the same one (for all portals, despite the realm provided); you will need a way to find the audienceUri to use (as it’s no longer the realm from the request, which is generally used), but that’s possible through configuration; you are also introducing a risk as DNN – the RP – will now accept tokens across portals, but that risk can be mitigated by DNN’s own authorisation.
- You can load the audienceURIs section of the configuration in the RP from somewhere other than the web.config (a database table, for example); to do this you would need to add a handler to the FederatedAuthentication.ServiceConfigurationCreated event in the FAM on the RP side (best done through the constructor of your custom FAM, as InitializeModule is called after the configuration has been loaded) and set the audience uri for each portal alias in the DNN database; in a sense this is the RP version of the previous option as it will allow any token to access any allowed portal, even if the token is issued to one portal and the redirect goes to another; it does solve the need to edit the web.config, but it does not solve the need to restart the app domain when changes have been made, as the call to the database will only happen once – when the module is initialised.
Both the options above provide some answer to the first approach mentioned at the beginning of this section – allow access to all possible realms, without having to edit web.configs.
The two options below talk about how you could implement a more dynamic check – moving further away from the existing method of checking against a static list of audienceUris -
- You can implement your own SamlSecurityTokenHandler (you might want to implement two – one that inherits from Saml2SecurityTokenHandler and another that inherits from Saml11SecurityTokenHandler) in which you would override the ValidateConditions method; in ValidateConditions you would call the base ValidateConditions method, passing in false for the ‘enforceAudienceRestriction’ parameter – this would ensure the configured audienceUris are not checked by the base method; you would then implement your own audienceUri validation, presumably against the DNN database (the conditions parameter passed to you will contain the audienceUri provided by the STS); you could use either code or configuration to set up your RP to use these token handlers instead of the built-in ones.
- A slightly re-factored version of the above is to wrap the validation code required in a custom SamlSecurityTokenRequirement class in which you override the ValidateAudienceRestriction method; both Saml2SecurityTokenHandler and Saml11SecurityTokenHandler classes allow you to provide them with a custom SamlSecurityTokenRequirement class in order to override the built-in logic; this allows you to write the validation logic just once; you will need to replace the default class in the token handlers with yours, which is best done in the same ServiceConfigurationCreated event mentioned above.
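Whichever of the last two options you pick, the custom check itself boils down to something like the sketch below. It deliberately leaves out the WIF plumbing (the handler or requirement class it would be called from), and everything in it is hypothetical: the PortalAudienceValidator name, the delegate standing in for a lookup against the DNN database, and the decision to compare addresses ignoring case, query string and trailing slash.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PortalAudienceValidator
{
    // Stand-in for a query against the portal alias table; it is called on every
    // validation, so newly added portals are picked up without a restart.
    private readonly Func<IEnumerable<Uri>> _loadAllowedAudiences;

    public PortalAudienceValidator(Func<IEnumerable<Uri>> loadAllowedAudiences)
    {
        _loadAllowedAudiences = loadAllowedAudiences;
    }

    // True when at least one audience URI in the token belongs to a known portal.
    public bool IsValid(IEnumerable<Uri> tokenAudiences)
    {
        var allowed = new HashSet<string>(
            _loadAllowedAudiences().Select(u => Normalize(u)),
            StringComparer.OrdinalIgnoreCase);
        return tokenAudiences.Any(a => allowed.Contains(Normalize(a)));
    }

    // Scheme, host and path only, without a trailing slash.
    private static string Normalize(Uri uri)
    {
        return uri.GetLeftPart(UriPartial.Path).TrimEnd('/');
    }
}
```

In a real implementation you would probably want to cache the lookup for a short while rather than hit the database on every token, and decide whether a case-insensitive comparison is acceptable for your portals.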
DNN’s UrlRewriteModule will redirect any request to the DNN vdirs copying any paths “lower” than the DNN vdir to the query string -
This means that even without dynamically setting the realm and reply address as described above, out-of-the-box you would simply get an infinite redirect situation due to lost cookies -
If you have a portal with the alias www.MyDomain.com/DNN/MyPortal and you set the realm and reply-to to this address, you get redirected to the STS and then back to the portal correctly; cookies will be set at the URL above.
However, the UrlRewriteModule will redirect the request to www.MyDomain.com/DNN/default.aspx?alias=MyPortal; as the authentication cookies are stored in the original location (one or more virtual directories “lower” than the DNN root directory) they cannot be found and so the user is redirected back to the STS and so on…
The obvious solution is to set the cookie handler path in the configuration to be the DNN’s virtual directory – it would mean that moving between portals would require obtaining a new token and a roundtrip to the STS, but it will solve the circular redirect, which is better (and in any case this is not a likely scenario as far as real users are concerned, as they will normally work on a single portal)
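The reason the cookies “cannot be found” is simply the way cookie paths are matched. The sketch below is a simplified version of that browser rule, not WIF or DNN code, and the CookiePathDemo name is made up; it shows why a cookie stored for the portal’s path is not sent with the rewritten request, while one stored at the DNN virtual directory is.

```csharp
using System;

public static class CookiePathDemo
{
    // A cookie is sent only when the request path is the cookie path itself
    // or sits somewhere below it.
    public static bool CookieIsSent(string cookiePath, string requestPath)
    {
        string c = cookiePath.TrimEnd('/');
        return requestPath.Equals(c, StringComparison.Ordinal)
            || requestPath.StartsWith(c + "/", StringComparison.Ordinal);
    }
}
```

So CookieIsSent("/DNN/MyPortal", "/DNN/default.aspx") is false, which is the endless redirect, while CookieIsSent("/DNN", "/DNN/default.aspx") is true.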
Logout –
Ok – possibly the easiest point encountered – we needed to replace the logout functionality (which would log the user out locally from DNN) with the ws-federation single sign out supported by our STS; to do this we simply replaced the code in Login.ascx.vb (in our case it existed in [path to root DNN website]\admin\skins), changing the out-of-the-box redirect to a redirect to the STS with ?wa=wsignout1.0 in the query string.
Going over most items in this list with Jon Simpson – the chief architect in one of the companies I work with – he raised an interesting point – why is it that every time the Geneva Framework gets upset the user ends up in an endless redirect loop? (ok, he didn’t put it quite in those words, and also generally his view is, naturally, limited to the bits he cares, or has to worry, about, but the man has a point)
I did not buy into this initially, but I suspect there’s a lot of sense in expecting the framework to recognise there’s an endless loop in place and display some error instead of keeping it going; we will certainly need to do something on our end, as an endless redirect, even if caused by a configuration error on our part, is not acceptable, but this should be part of the framework, or so Jon says :-).
Somewhat quietly Microsoft released another CTP of ‘Oslo’, which you can find on the Oslo dev centre; it is great to see the team release early and frequent drops of ‘Oslo’, especially given the impact ‘Oslo’ is likely to have on how we build software.
Having the chance to play with the bits so early, and provide feedback, which they seem to be very keen on receiving, is pretty awesome!
I’ve only had a brief play with it and I guess that, to me, the greatest news is that – all the potential benefits aside – my existing code (BTSDF) still worked as is; of course this is only temporary, as the team have made significant changes (read: improvements) to the API, but they have kept backwards compatibility TEMPORARILY as they work to align all their existing code to the new model as the rest of us worry about ours.
So, next I needed to spend some time looking at the new release in detail and align my code with it, so I reap some of the benefits from the improvements made, but that’s a much better position to be in than – “it’s all broken now and I need to figure out how to fix it”; kudos guys!
Paul Arundel was kind enough to give me a gentle nudge to start taking a look at these changes, and it certainly took me longer than I would have wanted to get around to these things thanks to other commitments (a repeating theme here, recently), but – slowly but surely – I went through my code and am happy to say I’ve now published the necessary changes to codeplex.
I was going to write a post on the changes required when moving to the new code, but Paul had done so already, and so there’s little point in me saying pretty much the same things (Paul is only focused on the aspects that are relevant to his project, but as we both focus on pretty much the same area my words would have been pretty much identical).
There’s only one point I think is worth mentioning from my end – it’s probably only me – but being able to walk the graph in an easier and more convenient way, and being able to easily access nodes by label (or by checking their brand) motivated me to work more on the grammar itself, or – to be more specific – on the projection of the grammar – where previously I would just jump through whatever hoops were needed to parse whatever projection I got.
This is a very good thing I think (although it is the grammar itself that really matters, as this is what users will see); the other half of this, though, is that it is still not quite possible to get the projection just right – I can’t get a projection that would work well with any M model for my domain, for instance, but I suspect the smart guys in Redmond are hard at work sorting this out for us…
I bumped into a situation where a query string passed from one RP to another RP was lost as the request went through the STS.
Looking through this carefully I think I figured out why this would happen, and it turns out that the scope of this issue is slightly bigger (and therefore the title of my post is slightly misleading), see the details below - but first – here’s how it goes when everything goes well -
A user tries to access an RP url with some query string parameters; the RP configuration redirects the user to the STS providing realm uri and reply address; as these two values are generally retrieved from the RP configuration, they are unlikely to include the query string parameter.
The good news, though, is that the FAM keeps the original URL requested by the user (excluding the scheme and domain) in a context field passed to the STS with the sign-in request (the ‘ru’ field).
Once the STS authenticates the user it redirects the request to the reply address provided by the RP; this is NOT the original address the user requested (now in the context) but the reply value provided by the RP through the request, and it will NOT contain the query string parameter.
The FAM, at the RP side, now extracts the URL the user requested from the context and does a second redirect to that URL – the one with the query string.
This time the local cookie is found, the user is authenticated, and the request can be processed by the RP code, where the query string is available to be read.
So – where is it broken? It appears that if, after all of the above had happened, a request to the RP, with some query string, is redirected again to the STS for authentication for some reason, then as the request comes back from the STS to the RP, the SignInResponseMessage is ignored (as there’s already an authentication cookie) and that second redirect, to the URL in the context, does not happen.
In this case, the RP code that gets executed is that of the page defined in the reply URL in the configuration, which is not likely to have the query string the user requested; in fact – and here’s the bigger scope – it may well be a completely different page to the one the user requested in the browser!
So – the question is – why would I be redirected to the STS when there’s already an authentication cookie for this realm? (after all – if there is a cookie I would not be redirected to the STS, right?!) - it turns out there are some ‘edge’ cases where this is possible, here’s one -
Step 1
1. Some link somewhere sends the user to the RP using domain name in the URL, with some query string parameters – http://MyDomain.com/SomeApp/Default.aspx?Something=SomethingElse
2. The RP redirects the request to the STS for authentication, providing the realm and reply from the configuration; let’s assume the reply address was set to http://MyDomain.com/SomeApp/SSO.aspx
3. The STS authenticates the user, and redirects back to the reply URL above
4. At the RP, the FAM processes the SignInResponseMessage , stores the authentication cookies for MyDomain.com/SomeApp and then redirects to the context’s http://MyDomain.com/SomeApp/Default.aspx?Something=SomethingElse (note that SSO.aspx never got executed, and that the query string is now available for Default.aspx)
Step 2
5. Now assume that another link somewhere sends the user to the same application, but for whatever reason it uses the server’s IP address instead of the domain – http://<some IP address>/SomeApp/Default.aspx?Something=SomethingElse
6. The RP cannot find any authentication cookies for this URL (as it has the IP address and not the domain name) and so it, too, redirects the request to the STS for authentication; however – as it’s the same configuration, the STS URL, the realm and the reply address are the same as in Step 1, so the STS redirects straight back to the reply address, http://MyDomain.com/SomeApp/SSO.aspx
7. At the RP, as the domain in the URL is now the same as it was in Step 1, the FAM finds the previously set cookie and so it ignores the SignInResponseMessage; this means that the redirect that would have happened to the original URL the user requested does not happen, and so it is the page at the reply address, without any query string, that gets processed
There is actually a much simpler way to reproduce the issue, albeit slightly less realistic (post testing) – access the RP via its IP address while the reply address in the configuration is set using the domain name.
The cookie will be stored for the domain name, and so any subsequent attempts to access the RP, with query string parameters, will result in them being removed as the redirect at the FAM does not happen.
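The whole sequence can be condensed into a few lines of toy code. This is only a model of my reading of the behaviour described above, not the FAM’s actual logic: the RelyingPartyToy class is made up, cookies are tracked per host only, and the STS is assumed to always authenticate successfully and answer at the configured reply address.

```csharp
using System;
using System.Collections.Generic;

public sealed class RelyingPartyToy
{
    private readonly Uri _reply;   // reply address from the RP configuration
    private readonly HashSet<string> _cookieHosts =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public RelyingPartyToy(Uri reply) { _reply = reply; }

    // Returns the URL of the page that finally gets processed.
    public Uri Browse(Uri requested)
    {
        if (_cookieHosts.Contains(requested.Host))
            return requested;                     // cookie found, no trip to the STS

        // Round trip to the STS: the browser comes back at the reply address,
        // with the originally requested path and query carried in the context.
        Uri arrivedAt = _reply;

        if (_cookieHosts.Contains(arrivedAt.Host))
            return arrivedAt;                     // response ignored, second redirect never happens

        _cookieHosts.Add(arrivedAt.Host);         // cookie is stored for the reply host
        // The context excludes scheme and domain, so it is resolved against the reply host.
        return new Uri(arrivedAt, requested.PathAndQuery);
    }
}
```

Browsing first by domain name and then by IP address, as in the steps above, returns the page with its query string the first time and the bare reply page the second time.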
One of the items on my to-do list for a while now was to add support for single sign out in our passive scenarios; the idea is that if a user browses to several RPs, and then hits the sign out button on one of them, she would automatically be signed out of all the other RPs visited in this session.
Whilst, as you will see shortly, the framework has great support for this scenario, and it is easily achieved, it is not the out-of-the box behaviour; out of the box – if you’re using the SignInStatus control (with or without the FAM) and the FederatedPassiveTokenService control, when the user hits the sign-out button of the SignInStatus control, she will be signed out of the current RP, as well as the STS itself, but any other RPs the user had visited in this session will keep her logged in.
So – if the user browsed to application A, authenticated at the STS, and then browsed to application B, she is now signed on in both applications as well as on the STS; hitting the sign out button in application B will sign her out of application B as well as of the STS; if she tries to browse to application B now (with no browser caching), she will get redirected to the STS, and would need to re-authenticate there; the same would happen if she tried to browse to any application protected by the STS other than application A; within application A, however, the user would still be authenticated and she will be able to keep using this app.
In some cases this may be acceptable, but in our case the users assume that if they hit sign out, they are signed out of the entire “set”, and so we were set on achieving this behaviour.
It turns out that the framework has great support for this scenario, and that only very little code is required to achieve this; in fact – on the RP side – there’s nothing to do.
Both the FAM and the SignInStatus controls handle requests to sign off out of the box, all you have to do is send an HTTP request with “wa=wsignoutcleanup1.0” in the query string and the framework will take care of removing the local cookies; it will even return a nice image to indicate success (you can control which image to show through configuration);
To see this in action – create a standard scenario with two RPs configured to use a single STS; add to your STS an ASPX page, which would look something like this (you will need to update the urls to point at your RPs) -
<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="SignOut.aspx.cs"
Inherits="HRG.Profile.Identity.STS.Web.Passive.SignOut" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<body>
<form id="form1" runat="server">
<img src="https://localhost/ststests/testwebsite4/default.aspx?wa=wsignoutcleanup1.0" /><br />
<img src="https://localhost/ststests/testwebsite3/default.aspx?wa=wsignoutcleanup1.0" />
</form>
</body>
</html>
In the code behind add the following -
protected void Page_Load(object sender, EventArgs e)
{
FormsAuthentication.SignOut();
}
Now run through your scenario - login to one application, then browse to another, then browse to this test page; you should see a couple of “green ticks” indicating you have been signed out of both applications.
Now try to browse to either of them (make sure to refresh the pages to avoid browser caching) - you should notice that you’re no longer authenticated in either (nor on the STS) and that you’re redirected to the STS’ login page. Cool!
So- we’ve proved that there’s really nothing to do on the RP side to achieve single sign out; but what do we need to do on the STS side? well – when the user hits the sign out link button on the SignInStatus control a request goes to the configured url for the STS, so this would be the entry point; what we really need to do is figure out a way to, for example, dynamically generate a page similar to our test page above; to do that we need to be able to a) track the RPs a user had visited and b) control the behaviour of the STS when the user hits sign-out on any RP, to make the required sign out requests to all the other RPs.
Until now I’ve been using the STS control (FederatedPassiveTokenService) in my passive STS, and so – to add the behaviour required I would have to extend it, which is not something I felt comfortable doing; the alternative was to get rid of the control altogether and simply write the code required to handle both sign in and sign out from scratch, which is something I wanted to experiment with (in fact – I had to do much of it in a different area of my solution – bridging single sign out protocols, but that’s for another post), so I thought this was a good opportunity to give it a go.
As it turns out, as the framework has all the code to do the heavy lifting, all I needed to do is “control the flow”, and it was all relatively painless - I removed the controls from my page, and started to replace it by placing code in the code behind -
First – I needed to figure out what request I’ve received from the caller; this was as simple as two lines -
WSFederationMessage message = null;
bool messageCreated = WSFederationMessage.TryCreateFromUri(Request.Url, out message);
messageCreated now indicates whether the request to the STS was a valid one; message is expected to be either a SignInRequestMessage or a SignOutRequestMessage (there are two other possible request types that are not currently supported by the framework, but that’s for another day).
Before I’ll go back to my single sign out scenario, I need to complete the single sign in scenario as I no longer have the control on the page (I could, potentially, leave the control on the page and do nothing if the message was a sign in request – leaving the control to do all the work, and if the message was a single sign out request execute whatever code I needed to achieve that, but I wanted to get rid of the control so I implemented code for both paths)
So – to implement the single sign in, my STS page needed to call the STS backend, get a SignInResponse message and write that to the Http response stream. How is this done? Well – there may be many flavours, but the main code would look something like this (some elements removed for brevity) -
if (message is SignInRequestMessage)
{
    SignInRequestMessage requestMessage = message as SignInRequestMessage;
    // Create our STS backend
    SecurityTokenService sts = new MySTS(stsConfig);
    // Create the WS-Federation serializer to process the request and create the response
    WSFederationSerializer federationSerializer = new WSFederationSerializer();
    WSTrustSerializationContext serialisationContext = new WSTrustSerializationContext();
    // Create RST from the request
    RequestSecurityToken request = federationSerializer.CreateRequest(requestMessage, serialisationContext);
    // The thread's principal would not be an IClaimsPrincipal, so create one from the contained identity
    IClaimsPrincipal principal = ClaimsPrincipal.CreateFromIdentity(Thread.CurrentPrincipal.Identity);
    // Get the RSTR from our STS backend
    RequestSecurityTokenResponse rstr = sts.Issue(principal, request);
    // Create the response message from the RSTR
    SignInResponseMessage responseMessage = new SignInResponseMessage(new Uri(rstr.ReplyTo),
        rstr,
        federationSerializer,
        serialisationContext);
    responseMessage.Write(Page.Response.Output);
    Response.Flush();
    Response.End();
}
I’m creating an instance of the STS for each request, but am re-using the sts configuration class (which I keep as an “Application” variable in the STS’ asp.net application).
I’m then using a federation serialiser to create the RST, run this through the STS (providing the principal, “converted” to a ClaimsPrincipal) and then create a SignInResponseMessage out of the RSTR returned by the STS;
Job done – my STS now supports single sign in without the control;
You can already imagine what I needed to do to complete the sign out support - to start with I needed to add an else-if to handle SignOutRequestMessage (as I’ve mentioned – there are other types of requests theoretically possible, but let’s not worry about them at the moment) -
else if (message is SignOutRequestMessage)
{
}
The first thing I would do there is sign the user out of the STS itself -
FormsAuthentication.SignOut();
All I needed to do now was add bog standard ASP.net code to generate the required Http Get requests to all my RPs; but to do this I needed to keep track of which RPs a user had visited within the session, so I know which RPs to sign her out of; to help me achieve that I’ve created the following class to track the user’s visited realms -
using System.Collections.Generic;
using System.Linq;

public class VisitedRealmsTracker
{
    // key: "sessionId|realm", value: sessionId
    private readonly Dictionary<string, string> visitedRealms = new Dictionary<string, string>();

    public void Add(string sessionId, string realm)
    {
        string key = sessionId + "|" + realm;
        lock (visitedRealms)
        {
            if (!visitedRealms.ContainsKey(key))
                visitedRealms.Add(key, sessionId);
        }
    }

    public IEnumerable<string> GetAllRealmsForSession(string sessionId)
    {
        // find all visited realms for this session and return the second part of the key
        // (after the first '|'), which is the realm; take the lock and materialise the
        // result so the dictionary is never enumerated while another request modifies it
        lock (visitedRealms)
        {
            return visitedRealms
                .Where(visitedRealm => visitedRealm.Value == sessionId)
                .Select(visitedRealm => visitedRealm.Key.Split(new[] { '|' }, 2)[1])
                .ToList();
        }
    }

    public void ClearUserEntries(string sessionId)
    {
        lock (visitedRealms)
        {
            List<string> keys =
                visitedRealms.Where(realm => realm.Value == sessionId)
                             .Select(realm => realm.Key).ToList();
            foreach (string key in keys)
                visitedRealms.Remove(key);
        }
    }
}
I’ve added a member of this type to my STS Configuration class, which – you would remember – I now keep as an application variable, and so I could add a call to Add in the STS code handling the sign in request I showed earlier, which would ensure I’m tracking all the visited realms; the sign out logic could now iterate over the results of the GetAllRealmsForSession and handle the logout requests, simple code to achieve this could be something like -
foreach (string realm in stsConfig.Tracker.GetAllRealmsForSession(Session.SessionID))
{
    //get realm configuration
    ReliantPartyConfigurationElement rpConfig = Configuration.ReliantParties[realm];
    //create an image pointing at the realm's signout url, appending the signout and cleanup action
    Image img = new Image();
    img.ImageUrl = rpConfig.SignOutUrl.Trim() + "?wa=wsignoutcleanup1.0";
    Repeater1.Controls.Add(img);
    //add a line break after each realm
    LiteralControl br = new LiteralControl("<br />");
    Repeater1.Controls.Add(br);
}
This sample code simply creates the same images I’ve previously had hard coded in the test page dynamically.
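One thing to watch in the concatenation above: it assumes the configured sign-out URL has no query string of its own. Here is a minimal sketch of a helper that copes with both cases; SignOutUrls and AppendSignOutCleanup are names made up for illustration (they are not part of the Geneva Framework), and only the wa=wsignoutcleanup1.0 value comes from the text above.

```csharp
using System;

public static class SignOutUrls
{
    // Hypothetical helper, not a framework API: appends the WS-Federation
    // sign-out cleanup action to an RP url, whether or not the url already
    // carries a query string.
    public static string AppendSignOutCleanup(string rpUrl)
    {
        const string action = "wa=wsignoutcleanup1.0";
        UriBuilder builder = new UriBuilder(rpUrl.Trim());

        // UriBuilder.Query includes the leading '?' when it is not empty,
        // so strip it before building the new query
        string existing = builder.Query.TrimStart('?');
        builder.Query = existing.Length == 0 ? action : existing + "&" + action;

        // going through Uri drops the default port the builder may have added
        return builder.Uri.AbsoluteUri;
    }
}
```

With something like this in place, the image url in the loop could be set from SignOutUrls.AppendSignOutCleanup(rpConfig.SignOutUrl) instead of the string concatenation.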
With all the code in place – sign in requests are being processed by code instead of the control, with the code now customised to keep track of visited realms, and sign-out requests use this tracked information to dynamically build a page that issues the required Http Get requests to all the visited RPs to sign the user out of all of them; single sign out achieved. Easily.
I’ve blogged before (somewhat briefly, for a change) about my surprise when I learnt that URLs are [largely theoretically, in my view] case sensitive and the problem that this causes for a Geneva Framework based passive STS implementation.
In that post I mentioned a solution suggested by Peter Korn at the time – setting the path of the cookie to the domain root (‘/’) instead of the application path (including virtual directories), as, unlike the rest of the path, the domain name in a URL is not case sensitive; this works well, and I thought it was “case closed”; until recently, when I realised this solution has a very significant drawback - as the cookie, containing the authorisation token from the STS, is stored at the root of the domain, it will be served to every application under that domain, which is taking single-sign-on slightly too far :-)
Following this approach it is not possible to allow access to one application and deny it from another (on the same domain) other than through claims processing in the applications themselves, which is a less secure approach from an architecture perspective; clearly not a good solution then…
So – I needed to go back to storing the cookie in the correct path, which would ensure that the STS is re-visited when trying to access a second application (even in the same domain), which – in turn – would mean that the user’s permissions are re-evaluated, before a second, application-specific, token is provided; with that - came back the problem of the URLs being case sensitive.
Thankfully, we’re now on the TAP program for the Geneva Framework, and we’re getting great support by the guys at Redmond (can’t thank them enough!), and after bringing up this issue in a discussion, Shiung Yong suggested another approach to solving this - overriding the GetReturnUrlFromResponse method in the WSFAM.
(Side track: The more I work with the Geneva Framework the more impressed I am with the extensibility options it provides, sure – it’s hard to figure those out on your own if you don’t know about them, and yeah – the resulting solution is often somewhat fragmented, with bits of code in several places, but that’s not much different from many other solutions in this space I suspect – you can see this with many WCF implementations – on the upside, however, if you’re willing to put the sweat, you can do pretty much everything (but yes – the continuum moves from adding a couple lines of code to re-writing the framework :-) )
To understand why and how Shiung’s solution works, consider the following scenario, describing the problem (and here’s where my description is bound to get somewhat confusing) -
Out of the box, the flow of circular redirects, when the URL in the browser is entered in the “wrong” casing, is as follows -
- The user types in the RP’s URL, let’s say - all in uppercase, into the browser
- As the http request to the RP does not contain an authentication token at this point, the FAM at the RP redirects it to the STS, providing the RP’s ‘realm’ to the STS (the ‘realm’ is configured at the RP and is intended to provide a unique URI to the STS, which it can use to identify the RP, and, for example, be used to load the relevant configuration such as which certificates to use when creating the token); the original URI the user had typed in is also provided through the query string (the ru property in wctx); optionally, and crucially, the RP may also provide the wreply query string parameter, based on its configuration; it is expected that the STS will forward the request, after authenticating the user, to this address (but this is not mandated); this will become a key point shortly.
- Still at the STS the user authenticates (generally using a login screen), and the STS redirects the request, with a ‘sign in response’ message containing an authentication token, back to the RP; as mentioned before it is expected that this would be the address provided by the wreply (and this would be the default behaviour provided by the framework, but this can be easily overridden in the STS’ implementation); for this example, let’s assume that the configured value, echoed in the wreply property, is set to be in lowercase (remember – the user typed in the URL in the browser in uppercase).
- The redirect request contains the set-cookie instructions with the token from the STS and so the browser would set the required cookies in the address the STS redirected to - the lowercase address.
- In the step that would follow, the FAM does its sign-in ‘magic’, which concludes by redirecting the request to the URL set in the STS’ response message through the ru field in the query string - this is the URL the user typed into the browser initially, kept by the RP and then the STS - which is all uppercase
- At this point, FAM is called again for the new request, attempting to extract the authentication cookie, but as the cookie was stored on the URL the STS redirected to – which was lowercase - and the browser is now using the URL the user typed initially – which is all uppercase - the cookie is not served by the browser, and thus not found in the server code, and the user is being redirected back to the STS as if this was the first call;
- As the request arrives at the STS with the uppercase url again, the above would happen again and again in an endless cycle.
Confused? Hopefully not too much… but to summarise - out of the box, if the two URLs (the one configured as the reply-to address in the RP, or any other URL the STS uses to redirect back to the RP, and the one typed into the browser by the user) are not identical, including their casing, the cookie will be set, but subsequently not found when the FAM attempts to read it, and thus authentication at the RP would continuously fail.
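The root cause in the steps above is that a browser compares cookie paths character by character. As a rough sketch (this is the path-match rule as written in RFC 6265 section 5.1.4, not code from the Geneva Framework, and CookiePaths.PathMatches is a name invented for illustration), the check the browser effectively performs looks like this:

```csharp
using System;

public static class CookiePaths
{
    // Path-match rule from RFC 6265 section 5.1.4; every comparison is
    // ordinal, which is exactly why casing matters.
    public static bool PathMatches(string cookiePath, string requestPath)
    {
        if (string.Equals(cookiePath, requestPath, StringComparison.Ordinal))
            return true;

        if (!requestPath.StartsWith(cookiePath, StringComparison.Ordinal))
            return false;

        // cookiePath is a proper prefix here, so the index below is in range
        return cookiePath.EndsWith("/", StringComparison.Ordinal)
            || requestPath[cookiePath.Length] == '/';
    }
}
```

A cookie stored for /ststests/testwebsite3 matches a request to /ststests/testwebsite3/default.aspx, but not one to /STSTESTS/TESTWEBSITE3/default.aspx, so the uppercase request arrives without the token and the cycle starts again.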
In comes Shiung’s solution -
As long as there’s a convention in the implementation as to the correct form of the URLs (or if drowning in more configuration is acceptable) the FAM could be extended to overcome this -
Step 5 above mentions the FAM has some ‘magic’ authentication work with a redirect in the end; the built-in implementation uses the ru field to obtain the address to redirect to, but there’s a good extension point there in the form of the GetReturnUrlFromResponse method of the FAM, which is called to obtain the url; by overriding this method you can provide whatever logic you wish to control the URL the FAM would redirect the request to after authenticating the request.
Let’s say we can agree (as we have) that all reply-to addresses will always be configured in lowercase (whilst we can’t control user behaviour, we can control our own configuration); with that agreed we can override GetReturnUrlFromResponse to always convert the ru value to lowercase before returning it to the built-in functionality – here’s my version of the method, as suggested by Shiung -
public class CaseInsensitiveFAM : Microsoft.IdentityModel.Web.WSFederationAuthenticationModule
{
    protected override string GetReturnUrlFromResponse(System.Web.HttpRequest request)
    {
        string returnUrl = base.GetReturnUrlFromResponse(request);
        // guard against a missing return url, and lowercase culture-independently
        return returnUrl == null ? null : returnUrl.ToLowerInvariant();
    }
}
(it is important, of course, to remember to configure the RP to use your custom FAM and not the built-in one -
<!--<add name="WSFederationAuthenticationModule"
type="Microsoft.IdentityModel.Web.WSFederationAuthenticationModule,
Microsoft.IdentityModel, Version=0.6.1.0, Culture=neutral,
PublicKeyToken=31bf3856ad364e35"/>-->
<add name="CaseInsensitiveFAM" type="CaseInsensitiveFAM, Utilities"/>
)
What had just happened?
- By convention, we ensured the RP provided a lowercase reply to address to the STS.
- The STS uses this (lowercase) address to forward the request containing the authentication token, and this is where the cookies will be set.
- The FAM uses GetReturnUrlFromResponse to retrieve the URL to redirect to; my customised version ensures this would always be lowercase, aligned with the RP configuration.
- The browser is redirected, again to the lowercase address, this time to receive the cookies set in step 2 which means the request is now authenticated and the user is let in; no more circular redirects!
Of course I’ve implemented a hardcoded rule (always lowercase), but you could use configuration, investigate the http request message or any other logic you’d like…
Some issues remain with that approach (that I can think of) -
If, at this point, the user goes and types the URL in a different casing, as the cookie already exists and the FAM code will not execute again, the user will get redirected to the STS for authentication, but that’s fair enough – I don’t know of any user that would do that in real life… and the result (requiring re-authentication) is quite acceptable.
The other thing is that this solution would break should an application be case sensitive (for query string parameters, for example), but we don’t have that problem, and it could be handled by more sophisticated code in the custom FAM, so that’s ok as well.
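As a rough idea of what such “more sophisticated code” might look like, here is a sketch that lowercases everything up to the query string and leaves the query string (and any fragment) exactly as it arrived. ReturnUrls.LowerCasePathOnly is a hypothetical helper, not a framework method; the assumption is that it would be called from the GetReturnUrlFromResponse override instead of lowercasing the whole value.

```csharp
using System;

public static class ReturnUrls
{
    // Hypothetical helper: lowercases the part of the url before the first
    // '?' or '#', and keeps the query string and fragment untouched so
    // case-sensitive parameter values survive.
    public static string LowerCasePathOnly(string returnUrl)
    {
        if (string.IsNullOrEmpty(returnUrl))
            return returnUrl;

        int cut = returnUrl.IndexOfAny(new[] { '?', '#' });
        if (cut < 0)
            return returnUrl.ToLowerInvariant();

        return returnUrl.Substring(0, cut).ToLowerInvariant() + returnUrl.Substring(cut);
    }
}
```

This still assumes the application’s paths themselves are not case sensitive; if they were, the convention would need to live in configuration rather than in a blanket rule.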
I suspect this is not the clearest post I’ve ever published (but, unfortunately, probably not the worst), so I can only hope someone will manage to make sense of it and find it useful; I’m pretty sure I’ll need it for future reference; there’s no chance I’m remembering all of this!
Unfortunately life is a bit hectic for us at the moment, and it’s been a while since I’ve posted anything, or was able to do anything other than family or work.
As part of this general “neglect” I was unable to spend the time required to complete, and publish, my Oslo based solution for deploying BizTalk applications (given the somewhat ugly, but suitably short, name – BTSDF), but as many waited (ok – more than 5), and some kept asking (2), I thought I’d do my best to get something out of the door, so I have.
On CodePlex you can now find an initial version published, which includes the language definition, the core runtime I have, as well as two “executors” in various stages of [non] completeness – my MSBuild generator is already quite useful for simple apps; it generates a set of MSBuild files you can use to deploy the application on any machine, as well as the required SDC tasks dlls and targets file; the API executor deploys the app on the current machine, and is quite basic, but a good sample (I think) and a reasonable starting point.
I’m happy to entertain requests for changes/additions, and even more happy to add any one who’s willing and able to put the time as a contributor; Oslo knowledge is optional! :-)
You can download the source, build and run it locally – but you will have to remember to copy the two executors dlls from their folders to the main bin\debug folder as there is no compiled reference to them.
I’ve also uploaded an “Alpha release” which includes the compiled assemblies and the supporting files required.
There’s still more work to do, but it’s taking shape now, and I’m using the MSBuild executor today; it certainly needs a bit more documentation, which I hope to get around to…
I hope this works for you, let me know what you think (the good and the bad), and – if you have some time and will – drop me a line and I’ll add you to the team!
Yossi
Briefly back on my STS work -
Our STS implementation can already replace the authentication implementation of most of our applications; naturally we can’t do that just yet, given that the Geneva-framework has not been released yet, but all of my tests are quite positive so we’re just waiting for the opportunity to start using it.
However, so far, we were not in a position to replace the authorisation mechanism, not easily anyway, and that’s something that was on my list for some time now.
The STS provides a list of claims, which the applications can relatively easily access via code, as many samples show, and this proves very useful; application can investigate various claims about a user and drive their functionality from that.
It does mean, though, that the applications need to change to support this new claims based mode for authorisation, which is not something we can just assume we would be able to do; as a start, we just want to achieve an in-place replacement for our current authorisation logic.
Most of our web apps currently use ASP.net membership and roles and so they extensively use ‘IsInRole’ checks to figure out user authorisation and drive the application behaviour, to start with, we had to hook to that mechanism.
Luckily the Geneva framework has relatively good support for exactly this need - out of the box, it would convert any claims of the Microsoft role namespace (‘http://schemas.microsoft.com/ws/2008/06/identity/claims/role’) to roles; so – if a token included a claim of this type with a value of ‘Manager’, a call to HttpContext.Current.User.IsInRole("Manager") would return true.
And so I made sure my STS adds any roles with the correct claim type, very easy.
However – this is very Microsoft centric. What about all those claims that come from systems that don’t follow Microsoft’s approach (how dare they!)? And what about us wanting to have our own claims, using our own types, some matching roles (while others may not…)?
Well – we needed a way to map any claims to ms-role claims before the Geneva framework does its bit.
As is often the case - Dominick Baier was most helpful in posting on exactly that, and so, following his example, I created my RoleClaimsMapper -
public class RoleClaimMapper : ClaimsAuthenticationManager
{
    public override IClaimsPrincipal Authenticate(string endpointUri, IClaimsPrincipal incomingPrincipal)
    {
        //load configuration section for component
        RoleClaimsMapperConfigurationSection config =
            (RoleClaimsMapperConfigurationSection)ConfigurationManager.GetSection("RoleClaimsMapper");
        //create a collection of claim types and populate from configuration
        List<string> claimsToMap = new List<string>(config.RoleClaims.Count);
        foreach (RoleClaimConfigurationElement claimElement in config.RoleClaims)
            claimsToMap.Add(claimElement.ClaimType);
        //loop on all identities, we really only expect one, but can easily support multiple.
        foreach (IClaimsIdentity identity in incomingPrincipal.Identities)
        {
            //extract the claims that we need to map (matching the configured list of claims)
            IEnumerable<Claim> roleClaims =
                identity.Claims.Where<Claim>(c => claimsToMap.Contains(c.ClaimType));
            //now create a role claim (using the MS role claim type) for each claim found;
            //need to keep this outside claim loop so we don't modify the collection while iterating
            List<Claim> claimsToAdd = new List<Claim>(roleClaims.Count());
            foreach (Claim claim in roleClaims)
                claimsToAdd.Add(new Claim(Microsoft.IdentityModel.Claims.ClaimTypes.Role, claim.Value, claim.ValueType, "local", claim.Issuer));
            //add new claims to current identity
            identity.Claims.AddRange(claimsToAdd);
        }
        return incomingPrincipal;
    }
}
I then configured my authentication manager with the framework -
<microsoft.identityModel>
<claimsAuthenticationManager type="RoleClaimMapper,Identity.Utilities" />
and added my bit of custom configuration
<RoleClaimsMapper>
<RoleClaims>
<add Type="http://someDomain.com/identity/claims/SomeRole"/>
<add Type="http://someDomain.com/identity/claims/AnotherRole"/>
</RoleClaims>
</RoleClaimsMapper>
As you can see this code would take a list of claim types from configuration, and map all claims of these types to roles, adding them to the identity’s claims collection using the required claim type (leaving the original claim intact), and voila – when the app executes it can check the roles, corresponding to the values supplied in my custom claims using -
HttpContext.Current.User.IsInRole("[custom claim value]");
Very nice indeed!
UPDATE: Dominick’s comment on this post (on http://geekswithblogs.net/Connected/archive/2009/04/01/roleclaimsmapper-for-the-geneva-framework.aspx#453535) had indirectly suggested an even cleaner solution; instead of duplicating the claims, I can add the claim types I have as roles to each identity’s RoleClaimTypes collection; this achieves the same result in a much cleaner way. Here is the updated function -
public override Microsoft.IdentityModel.Claims.IClaimsPrincipal Authenticate(string endpointUri, Microsoft.IdentityModel.Claims.IClaimsPrincipal incomingPrincipal)
{
//load configuration section for component
RoleClaimsMapperConfigurationSection config =
(RoleClaimsMapperConfigurationSection)ConfigurationManager.GetSection("RoleClaimsMapper");
//loop on all identities, we really only expect one, but can easily support multiple.
foreach (IClaimsIdentity identity in incomingPrincipal.Identities)
//for each identity, add all the claim types that are role claim types.
foreach (RoleClaimConfigurationElement claimElement in config.RoleClaims)
identity.RoleClaimTypes.Add(claimElement.ClaimType);
return incomingPrincipal;
}
This is the third post in a series describing my Oslo-based solution for deploying BizTalk applications; I’ve used this exercise to play around with ‘M’, but it was important for me to work on a real solution, with real benefits – something I could actually use. In Part I I discussed the concept and presented both the “source code” of my app and the output I was working toward; Part II was all about the MGrammar part of the solution.
In this, third, part I will discuss the last missing piece – the runtime.
Before I start, though, I would say that I did find getting into ‘M’ somewhat confusing at first; and while it’s more than possible I’m still missing some things, I hope this series can help one or two people in their journey with Oslo – which is, without a doubt, an exciting one!
There are two things, I believe, that contributed to my confusion - the first is the fact that M is really many things, quite different things, actually; from what I hear Microsoft have identified the challenge some of us (me) are having getting a grasp on ‘M’ and are hard at work to bring things [closer] together; hopefully it won’t be long before we know what the converged language looks like. In the meantime one simply has to remember that -
There’s MSchema - which you could use to define models, a bit like xml-schema, or declaring your classes in code or even tables in SQL; I haven’t really touched on MSchema in this series but I might come back to that later.
Then there’s MGraph, which is a way to define instances of things, possibly ones that have been modelled using MSchema, but, as is evident from my little project, not necessarily - MGraph can be very useful even if you don’t have a model, as long as you have your grammar. In comes MGrammar, the third aspect of ‘M’, which can be used to define a syntax for your very own [domain-specific] language for describing things.
A ‘runtime’ could then be used to process the instances, described as MGraph, that result from inputs in your language.
And that is the second thing that really confused me – what is that ‘runtime’? In all the ‘M’ presentations I’ve seen, the ‘runtime’ was merely mentioned and never received enough “floor space”, and yet an MGrammar without a runtime is, in the majority of cases, quite useless; you have to have a runtime that would act on your source code. In fact, the runtime would act on the MGraph resulting from your language, which is what makes it all so brilliant, because in a sense this is where everything comes together – your runtime can work on instances described in your language, on MGraph instances stored in the repository created using MSchema and possibly even ones defined using Quadrant.
The point is that there must be a runtime that understands the model behind your language, can parse its graph and then do whatever you need it to do; and it is your job to build that runtime.
So what have I done for my runtime? Here’s a quick overview (reminder: the full source code will find its way shortly onto codeplex) -
My runtime is a console application, one that takes a source code file path as an argument and outputs MSBuild files (and dependencies) that can be used to deploy the application described in the source code onto BizTalk Server.
The first part of my runtime - which I will not bore you with - is about validating the command line arguments; standard stuff.
The second part is about creating the parser for my language, where, thankfully, the Oslo SDK does all of the heavy lifting – it includes a class called DynamicParser which, once created, you can use to parse your source code.
To create the DynamicParser you must first compile your language, and that’s easy enough to do – you start by creating a compiler
MGrammarCompiler compiler = new MGrammarCompiler();
and continue by supplying your grammar
compiler.SourceItems = new SourceItem[] {
    new SourceItem {
        Name = "BTSDeploy",
        ContentType = ContentType.Mg,
        TextReader = new StreamReader(GetLanguageDefinition())
    }
};
(GetLanguageDefinition() is a simple helper method I wrote to get the grammar file embedded as a resource in the exe)
Now you’re ready to compile your language, but to make things manageable you want to provide it with an error reporter; the compiler reports any errors to the stream you provide, and I’ve naturally used the console
TextWriterReporter errorReporter = new TextWriterReporter(Console.Out);
if (compiler.Compile(errorReporter) != 0 || errorReporter.HasErrors)
{
    Log("Failed to compile language definition\nSee above for details");
    return null;
}
If the compilation succeeded you are ready to create your parser -
DynamicParser parser = new DynamicParser();
compiler.LoadDynamicParser(parser);
That’s part one of three done.
The next step is to use the dynamic parser to parse your source code, the output of which would be a graph representation of the source; luckily the SDK does virtually all the lifting here as well, and it comes down to one line -
object rootNode = parser.Parse<object>(sourceCodeFileName, null, errorReporter);
Note that the output type is object – which, as you will find out if you try this out, is quite painful – currently all the types used in the graph are internal, which makes debugging quite difficult (you can’t look at any variables you hold in any meaningful way, you have to keep calling methods, as you’ll see next); hopefully this will change in one of the next updates to the SDK.
In any case rootNode is now pointing at the root of a graph – a tree-like structure you could ‘walk’ to extract the pieces of information you care about in the source code; here you’re expected to use methods like GetLabel, GetSequenceElements and GetSuccessors to reach nodes and their values in the graph and, of course, to do that you need to know exactly what your graph looks like. My first instinct was to look at the PreviewMode pane in Intellipad (usually the right-most pane when working with MGrammar) as it shows you a representation of the MGraph created for the source code and language used; this worked quite well, but, as I found out, wasn’t the most trivial thing – the two didn’t align completely and I ended up having to resort to trial-and-error to get the parsing logic right.
The reason is that M has a few shortcuts one could take, but the graph you would be working on is the very basic, more verbose format; some information on this is mentioned here.
Then, on a recent visit to Redmond, Dana Kaufman passed on a great tip – if you ‘compile’ your grammar using mg.exe to create the mgx file (basically a ZIP file containing a XAML representation of the language) and then use mgx.exe on your source file, adding a reference to the mgx file you just created, you end up with an ‘M’ file which is exactly the graph your runtime would end up working on; so useful!
So – here are a few examples of how I worked the graph – to start with I knew my root node should be a node with the label ‘Application’, so I checked it this way -
string label = graph.GetLabel(rootNode).ToString();
if (label != "Application")
I then knew that the application name would be a child element of the root node, so I extracted it like this
// extract the application's data - this should contain two nodes - the application name and the list of items in the application
List<object> appData = graph.GetSequenceElements(rootNode).ToList<object>();
//first line should be the application name, make sure it is not a node and extract the label
if (!graph.IsNode(appData[0]))
    Contents.AppName = appData[0].ToString();
The second node in the appData collection is where the graph ‘continues’, so to get the list of things that compose my application I needed to walk down that path -
//the second element should be the list of lines
foreach (object section in graph.GetSuccessors(appData[1]))
{
    //each successor would be a category (reference, importing binding, resource, etc), with a list of items
    processSection(graph, section);
}
with processSection starting with -
string sectionName = graph.GetSequenceLabel(graph.GetSuccessors(section).First()).ToString();
List<object> items = graph.GetSuccessors(graph.GetSuccessors(section).First()).ToList<object>();
Log("Found section '{0}'", sectionName);
switch (sectionName)
I hope that from these few examples you can see what it takes to work the graph – the graphBuilder (which is a somewhat confusing name, as I’m using it to walk the graph, not build it) has all the methods you need to access the various nodes (but there’s no xpath-like support), but as all the types are (currently) internal to the MS assembly you’re always working with objects, which is less than ideal.
Again – my full source code is on its way to codeplex, I just want to make sure it’s commented well enough to be well understood, and am struggling with time, but the bottom line is that once you’ve figured out how the graph builder works, learnt how to see your graph visually (using mg.exe and mgx.exe) and got used to the fact that you’re dealing with objects for now, parsing the source code is very easy.
Obviously it is completely down to you what you then do with all the information you’ve extracted from the source code; in my case my runtime is using a plug-in model so the first part is all about using the Oslo SDK to get an instance of a BizTalkDeployment class populated based on the contents of the input file, this class looks like -
public class BizTalkDeployment
{
public string AppName { get; set; }
public List<object> References { get; set; }
public List<object> Build { get; set; }
public List<BizTalkAssembly> BizTalkAssemblies { get; set; }
public List<object> ImportBindings { get; set; }
public List<Binding> AddBindings { get; set; }
public List<Assembly> Resources { get; set; }
}
I then use late binding and configuration to load a plug-in that takes an instance of this class and does the work, be it generating MSBuild scripts, deploying to the local machine using BTSTask or anything else.
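To make the late-binding idea a little more concrete, here is a minimal sketch of how such a plug-in could be loaded by type name. The IDeploymentPlugin interface, the EchoPlugin class and the trimmed-down DeploymentModel are all hypothetical names invented for this illustration, not the types my runtime actually uses; in a real setup the type name would come from configuration rather than being passed in by the caller.

```csharp
using System;

// Trimmed-down, hypothetical stand-in for the BizTalkDeployment class.
public class DeploymentModel
{
    public string AppName { get; set; }
}

// Hypothetical plug-in contract (an assumption made for this sketch).
public interface IDeploymentPlugin
{
    string Execute(DeploymentModel deployment);
}

// Example plug-in; a real one might emit MSBuild scripts or call BTSTask.
public class EchoPlugin : IDeploymentPlugin
{
    public string Execute(DeploymentModel deployment)
    {
        return "Would deploy " + deployment.AppName;
    }
}

public static class PluginLoader
{
    // typeName would normally be read from configuration.
    public static IDeploymentPlugin Load(string typeName)
    {
        // The second argument makes GetType throw if the type cannot be found.
        Type pluginType = Type.GetType(typeName, true);
        if (!typeof(IDeploymentPlugin).IsAssignableFrom(pluginType))
        {
            throw new InvalidOperationException(typeName + " does not implement IDeploymentPlugin");
        }
        return (IDeploymentPlugin)Activator.CreateInstance(pluginType);
    }
}

public static class Program
{
    public static void Main()
    {
        IDeploymentPlugin plugin = PluginLoader.Load(typeof(EchoPlugin).AssemblyQualifiedName);
        Console.WriteLine(plugin.Execute(new DeploymentModel { AppName = "MyApp" }));
    }
}
```

The point of the interface check is simply to fail early, with a readable message, when the configured type is not a plug-in at all.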
A few weeks back I worked on a process that looked something like this -
It was triggered by the scheduled task adapter and then used a SQL send port to call a stored procedure that returns a list of ‘things’.
It needed to split the things in the list to individual records, and to start a new, different, process, through pub/sub (to avoid the binary dependency with the called process), for each ‘thing’.
Fairly simple.
A lot has been said on the different ways to split messages, and I won’t repeat that discussion here; I would just say that initially I used a different approach – I used the SQL adapter in the initial, triggering, receive port and then used a receive pipeline, with an XmlDisassembler component, to split the incoming message so that each record was published individually, thus avoiding the need to have a ‘master process’. That backfired, though, in my case – I quickly realised I’d be choking the server with the amount of messages published and needed a way to throttle the execution; I played a bit with host throttling but then came to the conclusion that the best approach for me would be to throttle in a process, which is what I’ve done.
And so - to make things interesting, and because I already had it all ready - I decided to use a call to a pipeline from my process to split the message.
The first thing I realised, trying to take that approach, was that I had to change the type of the response message received from the SQL port to XmlDocument (an approach I generally dislike – I’m a sucker for strongly-typed-everything). My schema was configured as an envelope so that when I call the pipeline from my process it knows how to split the message correctly, but when that schema was used in the SQL port BizTalk split the message too early for me – I needed the whole message in the process first. If, however, I removed the envelope definition from the schema, the pipeline called directly from my process wouldn’t know how to split the message, which is no good either; nor could I have two schemas (BizTalk, as we all know, doesn’t like that one bit, not without even more configuration). XmlDocument it is.
It then came back to me (in the form of a compile time error :-)) that the pipeline variable has to exist in an atomic scope, and so I added one to contain my pipeline variable; I then added the necessary loop with the condition set to the GetNext() method of the pipeline and in each iteration constructed a message using the GetCurrent() method; all standard stuff.
I would then set some context properties to route my message correctly and allow me to correlate the responses (I used a scatter-gather pattern in my master process) and publish it to the message box.
What I noticed when testing my shiny new process was that all those sub-processes that were meant to start as a result of the messages published in my loop were delayed by quite a few minutes (6-8), which seemed completely unreasonable, so I embarked on a troubleshooting exercise which resulted in that big “I should have thought of that!” moment.
While the send shape in my loop successfully completed its act of publishing the message in each iteration, moving my loop to the next message and so on, being in an atomic scope meant BizTalk would not commit the newly published messages to the message box database (and so allow subscriptions to kick in) until the atomic scope finished; that is to allow it to roll back should something in the atomic scope fail.
What it meant for me, though, was that all the messages were still effectively published at once, which brought me back to square one (or minus one, actually, considering that the great delay caused by this approach means I’m even worse off than with my first debatch-in-pipeline approach).
And so I went back to the old and familiar approach of splitting the messages using xpath in the process, which allowed me to carefully control the publishing rate of messages for my process and throttle them as needed.
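For anyone who wants to see the shape of that approach outside an orchestration, here is a rough sketch in plain C#. It is an illustration only: in the real process the splitting is done with the xpath() function in expression shapes and the pause with a Delay shape, whereas the RecordSplitter class, the sample document, the batch size and the delay below are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Xml;

public static class RecordSplitter
{
    // Yields a standalone copy of every node matching recordXPath,
    // one at a time, so the caller controls the publishing rate.
    public static IEnumerable<XmlDocument> Split(XmlDocument envelope, string recordXPath)
    {
        foreach (XmlNode record in envelope.SelectNodes(recordXPath))
        {
            XmlDocument single = new XmlDocument();
            single.AppendChild(single.ImportNode(record, true));
            yield return single;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        XmlDocument envelope = new XmlDocument();
        envelope.LoadXml("<Things><Thing id='1'/><Thing id='2'/><Thing id='3'/></Things>");

        int published = 0;
        foreach (XmlDocument message in RecordSplitter.Split(envelope, "/Things/Thing"))
        {
            // Stand-in for the send shape that publishes to the message box.
            Console.WriteLine(message.OuterXml);
            published++;

            // Crude throttle: pause after every second message (arbitrary numbers).
            if (published % 2 == 0)
            {
                Thread.Sleep(TimeSpan.FromMilliseconds(100));
            }
        }
    }
}
```

Because each record is published outside any atomic scope, it is committed as soon as it is sent, which is what makes the throttling effective.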
We’ve been slowly migrating our services from asmx to WCF, but as we’re still using BizTalk 2006 with no support for WCF we’ve been exposing endpoints configured for basicHttpBinding and consume them using the SOAP adapter.
Generally speaking things have been going well, although we completely gave up on the idea of moving the services to WCF and NOT having to change the client, until yesterday, when we stumbled into a serialisation issue –
The SOAP adapter, as part of its work, deserialises the request message arriving through the send port into the web service proxy class it generated before calling the web service (which results in the class being serialised back into xml, but that is another story); that deserialisation failed.
The error message was clear enough and indicated it failed to deserialise an enum parameter the service was expecting, and that rang a bell – I posted on exactly that back in September, but after carefully checking and re-checking everything we could swear that our message (which was now suspended) perfectly matched the schema generated by the add web reference wizard; what’s going on then??
After chasing our tail for a short while we brought in Reflector to the rescue and found out that the cause of our woe is a combination of a difference in behaviour between WCF and ASMX and the use of BizTalk – here are the details –
Consider the following asmx web method –
[WebMethod]
public string GetDataUsingDataContract(CompositeType.someEnum myEnum)
{
return "Hello World";
}
With CompositeType being
public class CompositeType
{
public enum someEnum
{
Value1,
Value2
}
}
(..and pretend CompositeType has many more things, but these are irrelevant to this topic)
The definition for myEnum in the WSDL looks like
<s:element minOccurs="1" maxOccurs="1" name="myEnum" type="tns:someEnum" />
Where the type tns:someEnum looks like
<s:simpleType name="someEnum">
<s:restriction base="s:string">
<s:enumeration value="Value1" />
<s:enumeration value="Value2" />
</s:restriction>
</s:simpleType>
As a result the definition of the enum in a proxy generated via the add web reference VS 2005 option (which is what BizTalk would use) looks like –
[System.CodeDom.Compiler.GeneratedCodeAttribute("System.Xml", "2.0.50727.3053")]
[System.SerializableAttribute()]
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://tempuri.org/")]
public enum someEnum
{
Value1,
Value2,
}
All makes sense.
Now, let’s look at what WCF does in the same case; consider the following service –
[ServiceContract]
public interface IService1
{
[OperationContract]
string GetDataUsingDataContract(CompositeType.someEnum myEnum);
}
[DataContract]
public class CompositeType
{
public enum someEnum
{
Value1,
Value2
}
}
The WSDL generated looks like
<xs:simpleType name="CompositeType.someEnum">
<xs:restriction base="xs:string">
<xs:enumeration value="Value1" />
<xs:enumeration value="Value2" />
</xs:restriction>
</xs:simpleType>
<xs:element name="CompositeType.someEnum" nillable="true" type="tns:CompositeType.someEnum" />
The key difference is that the name of the class containing the enum has made it into the type name for the enum, which never happened in the ASMX version.
As a result the proxy is generated as such -
[System.CodeDom.Compiler.GeneratedCodeAttribute("System.Runtime.Serialization", "3.0.0.0")]
[System.Runtime.Serialization.DataContractAttribute(Name="CompositeType.someEnum",
Namespace="http://schemas.datacontract.org/2004/07/WcfService1")]
public enum CompositeTypesomeEnum : int
{
[System.Runtime.Serialization.EnumMemberAttribute()]
Value1 = 0,
[System.Runtime.Serialization.EnumMemberAttribute()]
Value2 = 1,
}
Again – note the name given to the element now contains the class name and, crucially, a dot (‘.’).
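The difference can also be seen without any service at all, by asking the two serialisers directly. The sketch below is only an illustration (the EnumNaming helper is a made-up name): it serialises the same nested enum with XmlSerializer, which ASMX relies on, and with DataContractSerializer, the WCF default, and prints the root element name each one picks.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;
using System.Xml.Serialization;

[DataContract]
public class CompositeType
{
    public enum someEnum { Value1, Value2 }
}

// Hypothetical helper, written for this illustration only.
public static class EnumNaming
{
    // Root element name chosen by XmlSerializer (the ASMX serialiser).
    public static string XmlSerializerRootName()
    {
        XmlSerializer serializer = new XmlSerializer(typeof(CompositeType.someEnum));
        StringWriter text = new StringWriter();
        serializer.Serialize(text, CompositeType.someEnum.Value1);
        return RootName(text.ToString());
    }

    // Root element name chosen by DataContractSerializer (the WCF default).
    public static string DataContractRootName()
    {
        DataContractSerializer serializer = new DataContractSerializer(typeof(CompositeType.someEnum));
        StringWriter text = new StringWriter();
        using (XmlWriter xml = XmlWriter.Create(text))
        {
            serializer.WriteObject(xml, CompositeType.someEnum.Value1);
        }
        return RootName(text.ToString());
    }

    private static string RootName(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.LocalName;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("XmlSerializer:          " + EnumNaming.XmlSerializerRootName());
        Console.WriteLine("DataContractSerializer: " + EnumNaming.DataContractRootName());
    }
}
```

Going by the two WSDL fragments above, the first name should come out as someEnum and the second as CompositeType.someEnum, the data contract serialiser being the one that prefixes the containing class.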
On its own – nothing too malicious – although it’s another nail in the coffin for the idea that you can substitute a web service with a WCF service, configure it to use basicHttpBinding and all should be the same (ok – am I the only one still wishing this was possible?)
Enter BizTalk.
When you use the add web reference wizard to add a reference to the WCF service, BizTalk generates all the schemas and proxy for you, which is what you would use to create requests going to the service (and process responses).
Because the WSDL of the WCF service contains the longer name of the enum (with the class name, the dot and the enum name) the .net proxy generated is identical to the one created for the WCF service above; the schema, however, is generated incorrectly!
BizTalk “kindly” decides that having dots in the element name is not a good idea and removes them, so the schema generated looks like this –
<xs:schema xmlns:tns="http://schemas.datacontract.org/2004/07/WcfService1" elementFormDefault="qualified"
targetNamespace="http://schemas.datacontract.org/2004/07/WcfService1"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="CompositeTypesomeEnum" type="tns:CompositeTypesomeEnum" />
<xs:simpleType name="CompositeTypesomeEnum">
<xs:restriction base="xs:string">
<xs:enumeration value="Value1" />
<xs:enumeration value="Value2" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
“CompositeTypesomeEnum”??????
Well, we’ve seen this, and created a message with exactly that element, which – of course – the SOAP adapter failed to deserialise into
[System.Runtime.Serialization.DataContractAttribute(Name="CompositeType.someEnum",
Namespace="http://schemas.datacontract.org/2004/07/WcfService1")]
public enum CompositeTypesomeEnum : int
{
[System.Runtime.Serialization.EnumMemberAttribute()]
Value1 = 0,
[System.Runtime.Serialization.EnumMemberAttribute()]
Value2 = 1,
}
The solution was fairly simple – we simply changed our xsl to emit the element name as the .net proxy requires it, and not as the schema describes it, and it all worked well.
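For what it’s worth, a dot is perfectly legal in an XML element name, which is easy to confirm with a couple of lines of plain .NET; the snippet below is just a sanity check I would run, not part of the fix itself.

```csharp
using System;
using System.Xml;

public static class Program
{
    public static void Main()
    {
        // VerifyNCName returns the name unchanged when it is a valid NCName
        // and throws an XmlException otherwise.
        Console.WriteLine(XmlConvert.VerifyNCName("CompositeType.someEnum"));

        // An element with the dotted name, as the .net proxy expects it, parses fine.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<CompositeType.someEnum xmlns='http://schemas.datacontract.org/2004/07/WcfService1'>Value1</CompositeType.someEnum>");
        Console.WriteLine(doc.DocumentElement.LocalName);
    }
}
```

Both lines print CompositeType.someEnum, so there is nothing in XML itself that forces the dot to be dropped.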
From the Microsoft Knowledgebase -
The <xsl:import> element is used to import an external XSLT file. The <xsl:include> element is used to include an external XSLT file. You cannot use these elements in custom XSLT files that are defined in the Custom XSL Path Grid Properties in a BizTalk project. You cannot do this because a Uniform Resource Identifier (URI) that is defined in an external XSLT file may be from a nonsecure source such as the Internet.
Am I (well, and Ben Gimblett here) the only one who thinks this is a lame excuse? Since when does MS try to protect developers from stupidity? And in any case, if they really wanted to do that – wouldn’t they have to prevent us from writing ANY code?
A couple of weeks ago I published a post describing my Oslo based deployment framework for BizTalk.
Two parts were missing from that post – the actual MGrammar and the runtime that processes the source code files.
In this post I will go over the grammar I created for the framework; I will try to go over the complete grammar, explaining the various steps. This is not intended to be a complete description of MGrammar (not that there’s a chance I could write one), but rather an overview by example; for more information on Oslo and ‘M’ visit the Oslo Dev Centre on MSDN.
It was important for me to create a solution that is completely usable, and indeed I have started to use this to generate the build scripts for my application, the price of which is that it might not be the best example code out there, but I hope you will find this useful.
Below is the complete grammar, after which I walk through it step by step; it might be useful to have another look at the example source code I included in my previous post to better understand what I’m trying to achieve with the syntax -
module Sabra.BizTalk.Deployment
{
language BTSDeploy
{
//main syntax is the entry point for the grammar - the first syntax to be parsed
syntax Main = app:AppDef
"{"
items:ApplicationItems
"}" => Application[app,valuesof(items)];
//application definition at root of source code
syntax AppDef = applicationKW name:ApplicationName => name;
//application items supports including all possible items multiple times in any order
syntax ApplicationItems = items:(Add | Build | ImportBinding | Comment)* => {valuesof(items)};
//now define the syntax for each item type -
//'Build' defines a solution or project to build during execution
//(due to limitations in our msbuild framework, runtime currently supports solutions only, but language should support both)
syntax Build = "build" path:Path";" => Build[path];
//binding to import into application
syntax ImportBinding = importKW bindingKW path:Path";" => ImportBinding{Path = path};
//the syntax of add further specifies the different add 'options'
syntax Add = addKW add:(Add_Reference | Add_Binding | Add_Assembly | Add_BTS_Assembly) => Add{valuesof(add)};
//each add option is defined next
//binding to add as resource to application. must specify environment name
syntax Add_Binding = bindingKW path:Path env:MultiWordName";" => Binding[path,env];
//defines a reference to another application, supports providing multiple applications in the same instruction
syntax Add_Reference = referenceKW ref1:ApplicationName refs:Add_AdditionalReferences*";" => Reference{ref1,valuesof(refs)};
syntax Add_AdditionalReferences = "," app:ApplicationName => app;
//add assembly defines an assembly to be added as a resource to the application
syntax Add_Assembly = assemblyKW path:Path details:AssemblyDetails";" => Resource[path,Details{details}];
//add biztalk assembly is similar to assembly, but allows specifying any contained orchestrations
syntax Add_BTS_Assembly = "biztalk" assemblyKW path:Path orch:Orchestrations? details:AssemblyDetails";" => BizTalkAssembly[path,orch,Details{details}];
syntax Orchestrations = withKW orchestrationsKW "{" type1:ApplicationName types:AdditionalOrchestrations* "}" => Orchestrations{type1,valuesof(types)};
syntax AdditionalOrchestrations = "," type:ApplicationName => type;
//assembly details
syntax AssemblyDetails = ver:AssemblyVersion+ culture:Culture+ pkt:PublicKeyToken+=>{Version{valuesof(ver)},Culture{valuesof(culture)},PublicKeyToken{valuesof(pkt)}};
token AssemblyVersion = versionKW "=" version:(AnyDigit*"."AnyDigit*"."AnyDigit*"."AnyDigit*)=>version;
token Culture = cultureKW "=" culture:Word=>culture;
//TODO: token should be 16 chars exactly
token PublicKeyToken = publicKeyTokenKW "="pkt:(AnyChar|AnyDigit)*=>pkt;
//keywords
@{Classification["Keyword"]}
token applicationKW = "Application";
@{Classification["Keyword"]}
token addKW = "add";
@{Classification["Keyword"]}
token bindingKW = "binding";
@{Classification["Keyword"]}
token referenceKW = "reference";
@{Classification["Keyword"]}
token importKW = "import";
@{Classification["Keyword"]}
token buildKW = "build";
@{Classification["Keyword"]}
token assemblyKW = "assembly";
@{Classification["Keyword"]}
token biztalkKW = "biztalk";
@{Classification["Keyword"]}
token withKW = "with";
@{Classification["Keyword"]}
token orchestrationsKW = "orchestrations";
@{Classification["Keyword"]}
token versionKW = "version";
@{Classification["Keyword"]}
token cultureKW = "culture";
@{Classification["Keyword"]}
token publicKeyTokenKW = "publicKeyToken";
//definition of a comment, similar to c# syntax
@{Classification["Comment"]}
token Comment = "//" CommentLineContent*;
token CommentLineContent
= ^(
'\u000A' // New Line
| '\u000D' // Carriage Return
| '\u0085' // Next Line
| '\u2028' // Line Separator
| '\u2029' // Paragraph Separator
);
//application name must start with a character and then include any character, digit or '.'
@{Classification["String"]}
token ApplicationName = AnyChar+(AnyChar | AnyDigit | ".")*;
//tokens used for definition of a file path
token Path = "\""PathRoot?FileSystemName("\\"FileSystemName)*"\"";
token PathRoot = AnyChar":\\";
token FileSystemName = (AnyChar | AnyDigit | Space | "-" | "_" | ".")+;
//common token definitions
token AnyChar = "A".."Z" | "a".."z";
token AnyDigit = "0".."9";
token MultiWordName = "\""a:(Word | Space)"\"" => a;
token Word = (AnyChar | AnyDigit | "-" | "_")+;
//the interleave will ensure the language allows whitespace anywhere
interleave Whitespace = Tab | LF | CR | Space | Comment;
token LF = "\u000A";
token CR = "\u000D";
token Space = "\u0020";
token Tab = "\u0009";
}
}
I’ve built the grammar top down and this is how I will walk through it -
First (1) I define my module, in a namespace-like manner; this is the logical container for my language; I then (4) declare my language and give it a name.
The main constructs in mgrammar are syntax and token; I often heard the guys at Redmond explain that when it comes to languages you can think of syntax as being the sentence and tokens as being the words; I think this is a very clear explanation. There are a few rules relating to them, as you can imagine; the important ones to remember at this point are that syntaxes can contain other syntaxes as well as tokens (and literals), while tokens can only contain other tokens (and literals); also – interleave does not apply to tokens.
The main syntax, and the entry point for any language is Main and you can see mine defined on line 7, and it looks like this -
//main syntax is the entry point for the grammar - the first syntax to be parsed
syntax Main = app:AppDef
"{"
items:ApplicationItems
"}" => Application[app,valuesof(items)];
‘//’ is used for comments, just like in c#, so the first line will be ignored.
syntax is one of the few keywords that exist in mgrammar, no explanation needed;
Main is the name of the syntax, which allows it to be referred to (used) by other syntaxes; in this case, as I’ve mentioned, ‘Main’ is also the entry point – the syntax the parser will start at; everything else should flow from here.
Now for - app:AppDef, but first – a note - in my mind there are two aspects to creating a language in mgrammar – there is the ‘parsing aspect’ – you define your language so that it describes the rules to parse your source code; and there is the ‘output aspect’ (or ‘production aspect’) – this is where you define the output of your language – in ‘M’ this is mgraph – so that it describes accurately the intent of any source code (and is easy-ish to work with at runtime).
The two are inevitably very mixed in any real-world work with ‘M’, which can be confusing, and today I want to focus on the parsing aspect – firstly because it is the more important one in my view (there’s nothing to work with before you’ve declared a good syntax for your language), and secondly – because I suspect we’re going to see some changes to the production aspect in the near future.
Production aspect ‘stuff’ is mainly defined after the ‘=>’ operator, as you can see in my syntax above, so for the time being just try to ignore that; there will be a bit more to ignore as you will see shortly.
AppDef is a name of a syntax declared somewhere else in the language ( line 13); it could also be defined in any imported languages, but I don’t have any, we will look at that in a second; app: is an alias assigned to this syntax which allows for it to be referenced in the production on the right side of the arrow operator; again – for the time being feel free to ignore any aliases, they have no impact on the parsing aspect.
So – my syntax main basically says we’re expecting to have in our source code something that matches the AppDef syntax, then an opening curly bracket then something that matches the ApplicationItems syntax and then closing curly bracket. simple.
Of course next the parser would look at the definition of AppDef and ApplicationItems, and so will we.
AppDef is defined in line 13 as an ApplicationKW followed by ApplicationName; these are defined in lines 51 and 90 respectively; let’s look at the ApplicationKW definition -
@{Classification["Keyword"]}
token applicationKW = "Application";
ApplicationKW itself is a token with a fixed literal ‘Application’ – this is a very simple rule to follow, and in fact I could have simply included this literal in the syntax definition and not used this token at all (which is what I have done previously).
The reason I have separated it out into its own token is related to the preceding line in the grammar – the classification attribute allows me to mark this token as a “keyword” for my language; this tells Intellipad (and, presumably, any other editor that learns how to work with mgrammar) that this token is a keyword and should be displayed as such; in Intellipad this means it is bolded in the editor, as you can see in the image below of my language open in Intellipad -
Back to the Main syntax’s components: ApplicationName, defined in line 51, states that an application name is composed of an AnyChar followed by any number of AnyChar, AnyDigit or the literal ‘.’, with AnyChar and AnyDigit defined in lines 98 and 99.
The ‘+’ sign indicates the syntax or token it follows must exist at least once; the ‘*’ sign indicates the syntax or token it follows can exist 0 or more times.
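For readers more at home with regular expressions, the ApplicationName rule can be approximated as below. This is only a sketch in Python, not MGrammar: it assumes AnyChar means an ASCII letter and AnyDigit a decimal digit (the actual definitions on lines 98 and 99 are not reproduced here), and the name is_application_name is made up for the illustration.

```python
import re

# Assumption: AnyChar = ASCII letter, AnyDigit = decimal digit.
# One AnyChar, then zero or more (the '*' operator) of AnyChar, AnyDigit or '.'.
APPLICATION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.]*")


def is_application_name(text: str) -> bool:
    """Return True when the whole text matches the approximated rule."""
    return APPLICATION_NAME.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["MyApp", "MyApp.SomeOrchestration", "1App", ""]:
        print(repr(candidate), is_application_name(candidate))
```

Replacing the ‘*’ in the pattern with ‘+’ would require at least one character after the first letter, which is the difference between the two operators described above.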
So – we have the definition of our application name, now let’s look at what ApplicationItems says -
syntax ApplicationItems = items:(Add | Build | ImportBinding | Comment)* => {valuesof(items)};
This syntax tells the parser that an application can have any number of Add, Build, ImportBinding or Comment in any order.
Moving on, we’ll look briefly at what ImportBinding looks like -
syntax ImportBinding = importKW bindingKW path:Path";" => ImportBinding{Path = path};
This is the importKW (which is the literal ‘import’, look it up!) followed by the bindingKW (‘binding’), the syntax for Path and a closing ‘;’.
I could have combined both literals, import and binding, into a single token and marked that as a keyword, but there are two benefits to splitting them up – firstly, by having two tokens I can have as much whitespace as I want between them, which I think is what developers generally expect, and, secondly – the ‘binding’ keyword is re-used for the add binding syntax I’ll describe shortly.
I’ll skip the Path definition, you can follow it yourself if you wish to; so next we can look at another item in the application items list – Add:
syntax Add = addKW add:(Add_Reference | Add_Binding | Add_Assembly | Add_BTS_Assembly) => Add{valuesof(add)};
The Add syntax starts with the addKW (‘add’) followed by one of the syntaxes for adding a reference, adding a binding, adding an assembly or adding a BizTalk assembly, but it only allows one; the add keyword (and therefore the entire add syntax) must be repeated as a whole to add multiple items to the application, as is suggested by the ApplicationItems syntax.
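The "exactly one alternative per add" idea can be sketched outside MGrammar too. The Python below is an illustration only: it looks at the keywords that follow ‘add’ and names the alternative that would match. The keyword spellings are taken from the sample source shown later in this post; the function name classify_add and the lookup table are hypothetical and are not how the real parser works.

```python
from typing import Optional

# Keyword sequences that may follow 'add', mapped to the syntax they select.
# Spellings are assumed from the sample source later in the post.
ADD_ALTERNATIVES = {
    ("reference",): "Add_Reference",
    ("binding",): "Add_Binding",
    ("assembly",): "Add_Assembly",
    ("biztalk", "assembly"): "Add_BTS_Assembly",
}


def classify_add(statement: str) -> Optional[str]:
    """Return the name of the Add alternative a statement starts with, or None."""
    words = statement.split()  # any amount of whitespace between keywords is fine
    if not words or words[0] != "add":
        return None
    for keywords, name in ADD_ALTERNATIVES.items():
        if tuple(words[1:1 + len(keywords)]) == keywords:
            return name
    return None


if __name__ == "__main__":
    print(classify_add('add reference SomeOtherApp;'))
    print(classify_add('add   biztalk   assembly "Schemas.dll"'))
```

Because each statement selects a single alternative, adding three bindings means writing three separate add statements.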
Let’s look at a couple of these items; first – the syntax for add binding -
syntax Add_Binding = bindingKW path:Path env:MultiWordName";" => Binding[path,env];
Here you can see the binding keyword being reused, as is the Path syntax; I’m then allowing a multi-word-name (which is essentially a string contained in double quotes) as the environment name for the added binding.
Quite simple, right? That’s the thing I love about mgrammar. Let’s look at one more syntax -
syntax Add_BTS_Assembly = "biztalk" assemblyKW path:Path orch:Orchestrations? details:AssemblyDetails";" => BizTalkAssembly[path,orch,Details{details}];
syntax Orchestrations = withKW orchestrationsKW "{" type1:ApplicationName types:AdditionalOrchestrations* "}" => Orchestrations{type1,valuesof(types)};
syntax AdditionalOrchestrations = "," type:ApplicationName => type;
The Add_BTS_Assembly syntax should be very clear; the only thing I haven’t mentioned so far is the ? sign, which indicates 1 or 0 appearances of the syntax/token it follows. I use this to allow a BizTalk assembly to optionally describe the orchestrations it contains so that, potentially, any instances of these could be terminated when undeploying the application.
The Orchestrations syntax, if present, requires at least one orchestration to be specified (I’m reusing the ApplicationName token as the orchestration name) but allows additional orchestrations to be specified as well; I’ve used the same approach for the add reference syntax.
I hope this makes sense, and that it gives you a glimpse into a practical use of mgrammar; I am certainly excited about this stuff.
Soon I hope to post about the last missing piece of the puzzle – the runtime that uses the language definition to parse, and then execute, any source code provided; after that the whole thing is likely to find a spot on CodePlex, bear with me a little bit longer…
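The "one required item, then any number of comma-prefixed items" pattern used by Orchestrations and AdditionalOrchestrations can be mimicked with a regular expression. The Python sketch below is an approximation under stated assumptions: names are taken to be letters, digits and dots, and parse_orchestrations is a made-up helper, not part of the grammar or the runtime.

```python
import re
from typing import List

# Assumption: an orchestration name is a letter followed by letters, digits or dots.
NAME = r"[A-Za-z][A-Za-z0-9.]*"

# 'with orchestrations { first (, more)* }' with free whitespace between tokens.
ORCHESTRATIONS = re.compile(
    r"with\s+orchestrations\s*\{\s*(" + NAME + r")((?:\s*,\s*" + NAME + r")*)\s*\}"
)


def parse_orchestrations(text: str) -> List[str]:
    """Return the orchestration names, or raise ValueError if the block is malformed."""
    match = ORCHESTRATIONS.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a valid orchestrations block")
    first, rest = match.group(1), match.group(2)
    additional = [name.strip() for name in rest.split(",") if name.strip()]
    return [first] + additional


if __name__ == "__main__":
    block = "with orchestrations { MyApp.SomeOrchestration, MyApp.AndAnotherOrchestration }"
    print(parse_orchestrations(block))
```

An empty pair of braces is rejected, which mirrors the requirement for at least one orchestration.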
In my “notes on BizTalk 2009” post I highlighted the fact that now, as part of the BizTalk installation, you can select the “Project Build Components”, which allow you to build BizTalk projects without Visual Studio or BizTalk installed on the build machine.
This is a very cool feature of the product, but I have to admit that I have missed something quite important; luckily – Mikael Hakansson hadn’t – the provided tasks do not support deployment of BizTalk projects, but only build (see Mikael’s article here).
What this means is that, while you can use the tasks deployed through the installation to verify that your BizTalk projects build correctly (as part of a continuous integration setup, for example), you have no way of verifying that they can be deployed or, more importantly, of running any automated tests.
The SDC tasks, with their support for BizTalk deployment, can come in very handy here, and you can certainly use them to augment the built-in build support with deployment capabilities, but they add more dependencies to the plate, which have to be planned for.
Of course, attending the MVP summit in Redmond last week gave us a great chance to raise this (and various other) subjects with the product group, and we certainly did not let that opportunity pass; now we just need to wait and see if they “pick up the glove”.
Since PDC I’ve been working on and off on an “Oslo” based solution for deploying a BizTalk application; unfortunately I couldn’t get a good chunk of time to play with this, so it’s been dragging a bit, but I’m getting close, so here are some details -
I’m a big advocate of automated builds; it’s a topic that probably deserves a post of its own, so I won’t get started on this here, but the idea is that one must have a way to be confident that, when it’s time to [re-]deploy the app, it will get deployed successfully time after time, without a hitch.
From my experience, deploying a BizTalk application of a decent size is often not trivial, and often it does not go smoothly if done manually (things are being missed, done out of order, wrong versions being picked up, etc), which does not help to boost the confidence in the solution (or BizTalk as a whole) in the organisation; automating the process can save a lot of time (can be done unattended, while in a meeting, out for lunch or overnight), save a lot of head scratching, boost the confidence in the solution and set the ground for proper automated builds, auto testing etc.
To that end, I have previously built an MSBuild based framework for deploying BizTalk applications.
While I’m sure it does not suit all purposes, as it was developed to support those scenarios I had, I think it’s quite comprehensive and it has served me well over the past couple of years.
It allows one to provide an MSBuild-formatted file with some parameters and lists and, using the pre-created framework, does things like remove the existing application (after terminating current instances and un-enlisting services), build the new solution(s), deploy assemblies to BizTalk, add/import binding files, start ports etc.
This is working great for us, and we’ve been using it extensively for quite a while now, but there’s one major downside – it requires one to maintain those MSBuild scripts.
Now - MSBuild is cool, I do like it very much, but as it’s generic – it does not speak in BizTalk terms, and as it’s XML – it’s quite verbose, and I wanted an easier format for all of us to work with.
So – inspired after seeing Oslo and even more driven after visiting PDC - I’ve decided to come up with an MGrammar based language for describing a BizTalk application to be deployed.
So far I’ve come up with “version 1.0” of my grammar, which allows me to describe the major artifacts in a BizTalk application, and a basic runtime to process source files written in this language; the language allows you to describe things like -
- References to other applications
- Solutions and projects to build
- BizTalk assemblies to deploy
- Orchestrations contained in those (these can then be terminated before attempting to remove the application prior to deployment)
- .net assemblies to deploy
- Binding file to import
- Binding files to add (and the name of the environment to attach to those)
As Oslo hasn’t RTM-ed yet, I can’t quite rely on it yet, and so I cannot use it in production in any shape or form, or can I?
I found a good middle ground for us which allows us to gain from the benefits I’m hoping to get by using my language, while not exposing ourselves too much to the risks of using early technology –
At the moment the runtime I’ve built for this is used to generate the MSBuild scripts we’re already using out of the source files; in this way - if Oslo disappears tomorrow, or significantly changes (not that I think that’s going to be the case), we’re safe – we still have the MSBuild scripts as “checked-in artifacts” and so we have lost nothing.
So – what does an instance of my language look like? Here’s an example:
Application MyApp
{
//add references to other applications
add reference SomeOtherApp;
add reference AndAnother,andAThird;
//build the solution
build "SomeFolder\SomeOtherFolder\SomeSolution.sln";
build "SomeFolder\SomeOtherFolder\SomeProject.btproj";
//add required biztalk assemblies
add biztalk assembly "SomeFolder\SomeOtherFolder\bin\deployment\Schemas.dll"
version=1.0.0.0
culture=neutral
publicKeyToken=314bd7037656ea65;
add biztalk assembly "SomeFolder\SomeOtherFolder\bin\deployment\Maps.dll"
version=1.0.0.0
culture=neutral
publicKeyToken=314bd7037656ea65;
add biztalk assembly "SomeFolder\SomeOtherFolder\bin\deployment\Orchestrations.dll"
with orchestrations
{
MyApp.SomeOrchestration,
MyApp.AnottherOrchestration,
MyApp.AndAnotherOrchestration
}
version=1.0.0.0
culture=neutral
publicKeyToken=314bd7037656ea65;
//and .net helpers assembly
add assembly "SomeFolder\SomeOtherFolder\bin\release\HRG.FareWatch.Hotel.Helpers.dll"
version=1.0.0.0
culture=neutral
publicKeyToken=314bd7037656ea65;
//import dev bindings
import binding "SomeFolder\SomeOtherFolder\Bindings.xml";
//add various environment's bindings.
add binding "SomeFolder\SomeOtherFolder\Bindings.Dev.xml" "development";
add binding "SomeFolder\SomeOtherFolder\Bindings.Stage.xml" "stage";
add binding "SomeFolder\SomeOtherFolder\Bindings.Prod.xml" "production";
}
As I’ve said – I already have a runtime that generates the MSBuild scripts required to deploy these – the runtime outputs several files to the temp folder -
- The “framework” MSBuild script that contains all the targets I’m using
- The Microsoft.Sdc.Tasks.BizTalk.dll and the Microsoft.Sdc.Tasks.dll which contain many useful custom tasks.
- The relevant import file for the SDC tasks
- The generated MSBuild file that contains all the various artifacts based on the source code
That latter file, which is the equivalent of my source code, and what we had to maintain so far, looks like -
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="DeployAll">
<Import Project="Deploy.Framework" />
<ItemGroup>
<MyApp Include="SomeOtherApp" />
<MyApp Include="AndAnother" />
<MyApp Include="andAThird" />
</ItemGroup>
<ItemGroup>
<Bindings Include="General">
<path>SomeFolder\SomeOtherFolder\Bindings.xml</path>
</Bindings>
</ItemGroup>
<ItemGroup>
<EnvironmentBindings Include="General">
<path>"SomeFolder\SomeOtherFolder\Bindings.Dev.xml"</path>
<environment>development</environment>
</EnvironmentBindings>
<EnvironmentBindings Include="General">
<path>"SomeFolder\SomeOtherFolder\Bindings.Stage.xml"</path>
<environment>stage</environment>
</EnvironmentBindings>
<EnvironmentBindings Include="General">
<path>"SomeFolder\SomeOtherFolder\Bindings.Prod.xml"</path>
<environment>production</environment>
</EnvironmentBindings>
</ItemGroup>
<ItemGroup>
<ExternNetAssembly Include="HRG.FareWatch.Hotel.Helpers">
<Version>1.0.0.0</Version>
<Culture>neutral</Culture>
<PublicKeyToken>314bd7037656ea65</PublicKeyToken>
<ProcessorArchitecture>MSIL</ProcessorArchitecture>
<path>SomeFolder\SomeOtherFolder\bin\release</path>
</ExternNetAssembly>
</ItemGroup>
<ItemGroup>
<Solution Include="SomeSolution">
<path>SomeFolder\SomeOtherFolder</path>
</Solution>
<Solution Include="SomeProject">
<path>SomeFolder\SomeOtherFolder</path>
</Solution>
</ItemGroup>
<ItemGroup>
<BTSProject Include="Schemas">
<Version>1.0.0.0</Version>
<Culture>neutral</Culture>
<PublicKeyToken>314bd7037656ea65</PublicKeyToken>
<ProcessorArchitecture>MSIL</ProcessorArchitecture>
<path>SomeFolder\SomeOtherFolder\bin\deployment</path>
</BTSProject>
<BTSProject Include="Maps">
<Version>1.0.0.0</Version>
<Culture>neutral</Culture>
<PublicKeyToken>314bd7037656ea65</PublicKeyToken>
<ProcessorArchitecture>MSIL</ProcessorArchitecture>
<path>SomeFolder\SomeOtherFolder\bin\deployment</path>
</BTSProject>
<BTSProject Include="Orchestrations">
<Version>1.0.0.0</Version>
<Culture>neutral</Culture>
<PublicKeyToken>314bd7037656ea65</PublicKeyToken>
<ProcessorArchitecture>MSIL</ProcessorArchitecture>
<path>SomeFolder\SomeOtherFolder\bin\deployment</path>
</BTSProject>
</ItemGroup>
<PropertyGroup>
<BTApplicationName>MyApp</BTApplicationName>
<!-- Set for a remote deployment -->
<!-- Deploying BizTalk Server name - leave blank if local-->
<!-- not currently supported by runtime-->
<BTServerName>
</BTServerName>
<!-- Deploying BizTalk Server database - leave blank if BizTalkMsgBoxDb-->
<BTServerDatabase>
</BTServerDatabase>
<!-- Deploying BizTalk Server SQL user name - leave blank if local-->
<BTServerUserName>
</BTServerUserName>
<!-- Deploying BizTalk Server SQL password - leave blank if local-->
<BTServerPassword>
</BTServerPassword>
</PropertyGroup>
</Project>
Would you agree that the former source code is easier to maintain?
Note: one thing you will notice is that none of the paths contains a root; I assume that this will be used by different developers/IT pros, who may have the code in different paths; however, as the assumption is that the code will come from source control, my framework script expects you to provide the root path to your source code and assumes the specified paths start there.
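To give a feel for the kind of translation involved, here is a minimal Python sketch that turns a list of (path, environment) pairs, as they appear in the ‘add binding’ statements, into an ItemGroup shaped like the generated file above, and that joins a relative path onto a source root. It is not the actual runtime: the function names are invented, and the MSBuild XML namespace is left out for brevity.

```python
import ntpath
import xml.etree.ElementTree as ET
from typing import List, Tuple


def environment_bindings_group(bindings: List[Tuple[str, str]]) -> ET.Element:
    """Build an <ItemGroup> with one <EnvironmentBindings> item per (path, environment)."""
    group = ET.Element("ItemGroup")
    for path, environment in bindings:
        item = ET.SubElement(group, "EnvironmentBindings", {"Include": "General"})
        ET.SubElement(item, "path").text = path
        ET.SubElement(item, "environment").text = environment
    return group


def resolve_path(source_root: str, relative_path: str) -> str:
    """Join a rootless path from the source file onto the supplied root (Windows rules)."""
    return ntpath.join(source_root, relative_path)


if __name__ == "__main__":
    group = environment_bindings_group([
        (r"SomeFolder\SomeOtherFolder\Bindings.Dev.xml", "development"),
        (r"SomeFolder\SomeOtherFolder\Bindings.Stage.xml", "stage"),
    ])
    print(ET.tostring(group, encoding="unicode"))
    # C:\Source is a hypothetical root supplied by whoever runs the deployment.
    print(resolve_path(r"C:\Source", r"SomeFolder\SomeOtherFolder\Bindings.xml"))
```

Keeping the paths rootless in the source file and supplying the root at run time is what lets the same file work on every developer’s machine.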
So – what’s next?
From a coding perspective – I plan to add support in my runtime to perform the actual deployment; this is what I would like it to do once Oslo is in production, so I’ll add it as an option through a command line switch.
This option would tell the runtime to deploy the artifacts to a specified BizTalk server using BTSTask or the object model instead of generating the MSBuild script.
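As a rough idea of what the BTSTask route could involve, the Python sketch below only builds the command lines and does not run anything. The helper names are hypothetical, and the switches shown (AddResource with /ApplicationName, /Type, /Overwrite and /Source; ImportBindings with /Source and /ApplicationName) should be checked against the BTSTask documentation for your BizTalk version before relying on them.

```python
from typing import List


def add_biztalk_assembly_command(application: str, assembly_path: str) -> List[str]:
    """Command line to add a BizTalk assembly as a resource of an application."""
    return [
        "BTSTask", "AddResource",
        f"/ApplicationName:{application}",
        "/Type:System.BizTalk:BizTalkAssembly",
        "/Overwrite",
        f"/Source:{assembly_path}",
    ]


def import_bindings_command(application: str, bindings_path: str) -> List[str]:
    """Command line to import a bindings file into an application."""
    return [
        "BTSTask", "ImportBindings",
        f"/Source:{bindings_path}",
        f"/ApplicationName:{application}",
    ]


if __name__ == "__main__":
    print(" ".join(add_biztalk_assembly_command(
        "MyApp", r"SomeFolder\SomeOtherFolder\bin\deployment\Schemas.dll")))
    print(" ".join(import_bindings_command(
        "MyApp", r"SomeFolder\SomeOtherFolder\Bindings.xml")))
```

Returning the arguments as a list keeps paths with spaces intact if the commands are later passed to a process launcher.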
I also want to make some modifications to my language definition to make the MGraph produced cleaner (and better geared towards using the repository and Quadrant at a later stage) as well as, obviously, add support for more features, after which I plan to publish both my language and the runtime somewhere (here, CodePlex, DevCenter... we’ll see).
I’ll post some more notes on all of these here in the near future hopefully…
In a comment on a previous blog post Travis Spencer asked
Can you explain more about how you implemented an STS that supports both active and passive scenarios?
So here’s how -
To start with – I’ve implemented my STS class with all the logic I needed; this was done as a class library with several classes – my STS implementation, my STS-Configuration class, an STS service factory, my custom WindowsUserNameSecurityTokenHandler implementation and all the classes I needed to support my custom configuration section.
Then, in order to support an active scenario, I’ve created a WCF service and, through the SVC file, I’ve configured it to use my STS service factory class -
<%@ ServiceHost Language="C#" Debug="true" Service="<My STS configuration class>" factory="<My STS Factory class>"%>
I’ve then configured the web.config of the wcf service to support my scenario – that included all the relevant binding configuration I needed, the Geneva framework related configuration (microsoft.IdentityModel) as well as any custom configuration my STS uses.
The passive scenario can seem a little bit more confusing -
Obviously I’ve started by creating an asp.net web application; this application basically has two web pages (admittedly I’m simplifying things a little bit for clarity) – default.aspx and login.aspx
Using standard asp.net forms authentication the web site is configured to redirect all unauthenticated users to the Login.aspx page, which in turn has a pretty standard login implementation using my custom username validator logic and the framework’s RedirectFromLoginPage function to set the local forms authentication cookie.
All my web-based relying parties redirect the user to the default.aspx page; forms-auth then redirects again to login.aspx for authentication and then, once authenticated, the user is redirected back to default.aspx; on this page I’ve simply put the FederatedPassiveTokenService control provided with the Geneva framework, configured to use my STS configuration class as the service; this takes care of calling the STS and posting the token back to the RP.
I hope that makes sense…do let me know if it does not!
Over the last few months we've been seeing quite a few Visual Studio crashes when working with the orchestration designer.
The scenarios vary slightly between machines - some would crash on build, others simply when opening an orchestration in the designer; some would crash very frequently, some only sporadically; some would crash even when opening the solution immediately after a reboot, others would work fine for a while and then crash; the bottom line was that we were having lots of crashes, and we were losing on developer productivity and gaining tons of frustration.
Because the problems did not happen in a very consistent manner, and because there were different manifestations we simply had no clue what might be the cause; we've talked to a few people and searched the web; we’ve followed almost every tip we've received (make sure no documents are open when you build, collapse all shapes in an orchestration before saving it, etc.); some seem to have helped a little bit, but the crashes kept happening.
Sometimes Visual Studio would show an error on the screen – something like -
But at other times it simply crashed without warning.
Eventually we contacted Microsoft support and, analysing dumps we had created, they found the following error -
Exception type: System.InvalidOperationException
Message: BufferedGraphicsContext cannot be disposed of because a buffer operation is currently in progress.
StackTrace (generated):
System_Drawing_ni!System.Drawing.BufferedGraphicsContext.Dispose(Boolean)
System_Drawing_ni!System.Drawing.BufferedGraphicsContext.Dispose()
System_Drawing_ni!System.Drawing.BufferedGraphicsContext.AllocBufferInTempManager(System.Drawing.Graphics, IntPtr, System.Drawing.Rectangle)
System_Drawing_ni!System.Drawing.BufferedGraphicsContext.Allocate(IntPtr, System.Drawing.Rectangle)
System_Windows_Forms_ni!System.Windows.Forms.Control.WmPaint(System.Windows.Forms.Message ByRef)
System_Windows_Forms_ni!System.Windows.Forms.Control.WndProc(System.Windows.Forms.Message ByRef)
System_Windows_Forms_ni!System.Windows.Forms.Control+ControlNativeWindow.OnMessage(System.Windows.Forms.Message ByRef)
System_Windows_Forms_ni!System.Windows.Forms.Control+ControlNativeWindow.WndProc(System.Windows.Forms.Message ByRef)
System_Windows_Forms_ni!System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr, Int32, IntPtr, IntPtr)
From this it was clear that the issue is related to GDI+ when VS redraws the orchestration and MS have confirmed they have seen this issue elsewhere (outside the BizTalk realm), but we were told that this problem was “aggravated by the fact that BizTalk creates very large device-independent bitmaps and uses the same colour depth than the desktop.”
Unfortunately, according to Microsoft, there are many contributing factors to this issue, and it is virtually impossible to isolate the cause(s) in our case, but three recommendations were made -
- Use a video card with more memory
- Lower colour depth setting from 32 bit to 16 bit in the display settings in windows
- Split large orchestrations to smaller ones
That last recommendation would have been our last resort – not only would it require a significant re-factor-re-test effort, as our solution is quite big, it would also seriously increase our already too long deployment time.
We were quite happy to follow the first recommendation, but as trying the second one was pretty effortless we went with that first and, what do you know, it worked!
Machines which have changed the colour depth setting to 16 bit stopped having those frequent crashes.
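As back-of-the-envelope arithmetic only (this did not come from the support case), the memory needed for an uncompressed bitmap scales with its colour depth, so halving the depth halves the buffer. The Python below shows the calculation for a made-up drawing surface of 4000 by 6000 pixels; the real sizes the designer allocates are not known here, and row padding and headers are ignored.

```python
def bitmap_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed bitmap in bytes, ignoring row padding and headers."""
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    width, height = 4000, 6000  # hypothetical size of a large orchestration surface
    for depth in (32, 16):
        megabytes = bitmap_bytes(width, height, depth) / 1_000_000
        print(f"{depth}-bit: {megabytes:.0f} MB")
```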
It must be said we’re still experiencing some crashes, so there are other issues as well, but these are more specific, and are another story altogether, one to be told once the end is known!
Mostly a reminder for myself, but hopefully useful to somebody else -
Often it is important to specify a specific user for a service to run as; it appears the setup is completely different when using IIS 5.1 or 6 (and higher).
When using IIS 5.1
- Set the anonymous user on the virtual directory to the user you want to run as.
- Disable any other authentication method on the vdir
- In the web.config turn impersonation ON (<identity impersonate="true" /> under system.web)
- Under System.serviceModel add <serviceHostingEnvironment aspNetCompatibilityEnabled="true"/>
- To the service class add [AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]
When using IIS 6.0
- In the web.config turn impersonation OFF (<identity impersonate="false" /> under system.web)
- Set the required identity in the app pool your virtual directory runs under.
(which one do you prefer? ;-))
In my previous blog post I’ve described how I consumed a service that uses ws2007FederationHttpBinding from BizTalk Server; my next task was to expose an orchestration as a WCF service that uses this binding.
In that post I’ve described what I think is a bug in BizTalk R2/2009 which prevents me from setting the issuer configuration through the UI.
When consuming such a service this configuration exists in the send port, and I’ve managed to get enough time to manually edit a BizTalk bindings file to import in order to set the required configuration I can’t set through the UI; I have tried to do the same for the receive location, but the binding file's format is different and my few attempts before my time ran out did not work.
I have submitted this bug to Microsoft, and hopefully they’ll provide a fix/some information about a possible work around; I suspect my best bet is to use the explorer OM to configure the receive location, but I just can’t get the time to do it now, so – I’m afraid this post is mostly a place holder, and I really hope I’ll be able to come back and update it with some more details soon.
Finally I’ve reached the point where I’m ready to hook up BizTalk to my STS implementation to participate in a federated identity scenario.
My goal is to confirm two scenarios -
1. Being able to call from a BizTalk process a service that uses the ws2007FederationHttpBinding (and requires that the caller provide a token issued by a specific STS)
2. Being able to expose a service in BizTalk that would use the ws2007FederationHttpBinding requiring the caller to provide such token.
If you followed my previous posts you would probably know that I already have a fairly extensive federated identity scenario – I have a custom STS built using the Geneva Framework, a few web sites and a test WCF service all use the STS for their authentication (and, indirectly, authorisation), so it is just a case of hooking BizTalk to all of that, and to start with I decided to try and consume that test service from a BizTalk process.
I’m using BizTalk 2009 Beta to keep things interesting, but I’m pretty sure it would all be exactly the same for R2.
The beginning was easy – in my BizTalk project I’ve used the “Consume WCF Service” wizard from the “Add Generated Items” menu to generate all the artifacts needed to consume my service; the generation went very smoothly and soon enough I had an orchestration that could receive a dummy trigger message, create the service request, receive the service response and deliver it to a send port for me to check it out.
I’ve deployed the project and then imported the binding file generated by the wizard to create the WCF send port.
Checking the port configuration, though, I found a few issues -
The port uses the WCF-Custom adapter, which was correctly configured with the service’s endpoint and the ws2007FederationHttpBinding; however – my STS endpoint is configured to use the ws2007HttpBinding and requires UserName credentials, but these settings were not copied over to the client configuration from the service's contract; the client configuration should have looked like -
<security mode="Message">
<message algorithmSuite="Default" issuedKeyType="SymmetricKey"
issuedTokenType="http://docs.oasis-open.org/wss/oasis-wss-saml-token-profile-1.1#SAMLV1.1"
negotiateServiceCredential="true">
<issuer address="[STS URL HERE]" binding="ws2007HttpBinding" bindingConfiguration="stsBinding">
<identity>
<dns value="STS"/>
</identity>
</issuer>
<issuerMetadata address="[STS MEX ENDPOINT HERE]" />
</message>
</security>
(with corresponding Ws2007HttpBinding named stsBinding with all the relevant configuration), but the binding attribute and the bindingConfiguration attribute were missing.
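A quick way to spot this kind of gap in a generated configuration is to check the issuer element for the attributes it should carry. The Python sketch below does that for a security fragment shaped like the one above; the helper name and the list of required attributes are assumptions made for the illustration, and it works on a plain configuration fragment, not on a BizTalk bindings file.

```python
import xml.etree.ElementTree as ET
from typing import List

# Attributes the issuer element needs in the scenario described above.
REQUIRED_ISSUER_ATTRIBUTES = ("address", "binding", "bindingConfiguration")


def missing_issuer_attributes(security_xml: str) -> List[str]:
    """Return the required issuer attributes that are absent or empty."""
    root = ET.fromstring(security_xml)  # expects <security> as the root element
    issuer = root.find("./message/issuer")
    if issuer is None:
        return list(REQUIRED_ISSUER_ATTRIBUTES)
    return [name for name in REQUIRED_ISSUER_ATTRIBUTES if not issuer.get(name)]


if __name__ == "__main__":
    generated = """
    <security mode="Message">
      <message negotiateServiceCredential="true">
        <issuer address="http://sts.example/issue" />
      </message>
    </security>
    """
    print(missing_issuer_attributes(generated))
```

The address in the sample fragment is a placeholder, not a real STS endpoint.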
It is worth noting here that this problem is by no means BizTalk specific – exactly the same thing (missing configuration) happens when I add a reference to my service from any standard .net app.
Also worth noting that in my case, and I don’t know how representative this is, this resulted in WCF trying to open the CardSpace card selector when I tried running the process, resulting in an error - “The channel is configured to use interactive initializer 'System.ServiceModel.Security.InfocardInteractiveChannelInitializer', but the channel was Opened without calling DisplayInitializationUI. Call DisplayInitializationUI before calling Open or other methods on this channel.”
Easy enough to fix, I thought, as I opened the adapter’s configuration for the send port finding that indeed there are the equivalent fields in the UI (the image shows them AFTER I have typed in the values I needed) -
However, after adding the missing two values and confirming the changes I kept getting the same error; it only took two minutes of head scratching to realise what was going on – opening the send port configuration again showed that the values I’d carefully typed in had not been persisted, and were now gone again; this must be a bug, which I’ve since confirmed exists in BizTalk 2006 R2 as well; in fact – even the issuer address, which was previously there, was removed from the configuration when trying to apply my additions.
Luckily importing these settings through a bindings file does work (as was evident from the fact that the issuer’s address did exist initially), and so I’ve edited my bindings file manually to include the missing settings (as well as the issuer identity setting which I later found out was missing for my scenario) and imported it using the admin console; checking the send port configuration I could now see my values for the “binding” and “bindingConfiguration” properties.
This raised another question, though – how do I configure this binding? ws2007FederationHttpBinding is somewhat “special” in the sense that there’s effectively an endpoint inside an endpoint:
Using this binding you have your service’s endpoint and its binding configuration; then the message security element of the ws2007FederationHttpBinding has the issuer element, which is effectively the endpoint of the STS with the address, the binding and the binding configuration (there is no need for a contract attribute as the contract is known – it is the wsTrust contract). This means that I now need to have another set of binding configuration in my WCF configuration – for the STS – which in my case uses ws2007HttpBinding, but there isn’t a way to add another binding through the send port configuration dialog directly.
Initially I tried to add the relevant configuration to the BizTalk configuration file (BTSNTSVC.exe.config), but it didn’t work – and I can’t quite explain why yet; what did work, which is good enough for me for now, was to put that exact same configuration in the machine.config; no idea why, but it works.
Another option, described in the documentation, is to use the Import option or the ExplorerOM API.
And so – once I’d done that, when I ran my process it called the send port going to my service, but the WCF adapter contacted my custom STS first, retrieved a security token from it, and then called my test service passing this token, which meant the service was allowed to execute.
I’m impressed!
There is one, fairly big, inefficiency here now, though – BizTalk does not appear to be caching service proxies, which means it will not cache anywhere the issued token, which in turn means a call to the STS must precede every call to my protected service; not ideal.
Luckily - one of the WCF samples provided by MS demonstrates how to create a behaviour that would cache such tokens locally; at some point (probably not right now, unfortunately) I will have to see how well this plugs into BizTalk (if at all), but I suspect it’ll work well; find that sample here.
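To make the caching idea concrete, here is a minimal sketch of a token cache with expiry. It is not the Microsoft sample and does not plug into WCF or BizTalk as written: the IssuedTokenCache and CachedToken types are hypothetical, the token is represented by a plain string, and the call to the STS is stood in for by a delegate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an issued token and its expiry time.
public class CachedToken
{
    public string Token { get; set; }
    public DateTime ExpiresUtc { get; set; }
}

// Hypothetical cache: one token per target service address.
public class IssuedTokenCache
{
    private readonly Dictionary<string, CachedToken> cache =
        new Dictionary<string, CachedToken>();
    private readonly object sync = new object();

    // Returns the cached token while it is still valid at nowUtc;
    // otherwise calls requestFromSts and caches whatever it returns.
    public string GetToken(string serviceAddress, DateTime nowUtc,
                           Func<CachedToken> requestFromSts)
    {
        lock (sync)
        {
            CachedToken entry;
            if (cache.TryGetValue(serviceAddress, out entry) && entry.ExpiresUtc > nowUtc)
            {
                return entry.Token;
            }

            entry = requestFromSts();
            cache[serviceAddress] = entry;
            return entry.Token;
        }
    }
}

public static class TokenCacheDemo
{
    public static void Main()
    {
        IssuedTokenCache cache = new IssuedTokenCache();
        int stsCalls = 0;
        DateTime start = DateTime.UtcNow;

        // Fake STS call that counts how often it is invoked.
        Func<CachedToken> fakeSts = () =>
        {
            stsCalls++;
            return new CachedToken { Token = "token-" + stsCalls, ExpiresUtc = start.AddMinutes(10) };
        };

        cache.GetToken("http://localhost/service", start, fakeSts);
        cache.GetToken("http://localhost/service", start.AddMinutes(1), fakeSts);
        Console.WriteLine(stsCalls); // 1
    }
}
```

The clock is passed in rather than read inside the cache purely so the expiry logic is easy to exercise; a real behaviour would take the expiry from the issued token itself.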
I wasn’t sure how big a problem this was, and in any case thought that optional parameters are a VB thing, but I’ve been proven wrong, and not only because optional parameters are soon to become a C# thing :-)
When using call/start orchestration we increasingly, as our solution “matures”, find that we need to add optional parameters to our processes.
In C# I was trained that function overloading is the correct way to implement this, and although I can see the benefit of the slim optional parameters approach I do think that overloading is much clearer; we’re talking BizTalk though, and you can’t really overload a function, can you? :-)
So we [can] do several things -
When dealing with .net types we can simply break the interface, add the parameter, change all the calling processes to supply one, and – if they don’t actually need to supply a value for this new parameter make them pass one anyway (an empty string, or a 0, or even –1); not elegant to say the least, and quite time consuming which means upsetting both the architect and the project manager.
The same approach works quite well when dealing with optional messages as well, only that now constructing that unnecessary extra parameter, in the shape of a content-less message, is even harder (see my post on creating messages from scratch).
Another option, valid for .net types only, is to have a <something>Parameters class which would include all the parameters a process takes as properties; this would allow you to add as many properties as you need, without forcing callers to specify them; there are downsides of course – it is less readable and intuitive in my view, there’s a bit more hassle for the caller (create instance of parameters class, assign members) and now it’s harder to force those non-optional parameters (can be done via the parameter class’ constructor).
But there’s no equivalent option for optional message parameters.
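As a rough illustration of the parameters-class option, the sketch below shows one way such a class could look. The ProcessOrderParameters name and its members are made up for the example; mandatory values go through the constructor and optional ones are plain properties with defaults. The class is marked [Serializable] on the assumption that it will be held in an orchestration variable outside an atomic scope.

```csharp
using System;

// Hypothetical parameters class for an imaginary "ProcessOrder" orchestration.
[Serializable]
public class ProcessOrderParameters
{
    private readonly string orderId;

    // Mandatory values are constructor arguments, so callers cannot omit them.
    public ProcessOrderParameters(string orderId)
    {
        if (String.IsNullOrEmpty(orderId))
        {
            throw new ArgumentException("orderId is required", "orderId");
        }

        this.orderId = orderId;

        // Defaults for the optional values.
        Priority = 0;
        NotifyEmail = String.Empty;
    }

    public string OrderId
    {
        get { return orderId; }
    }

    // Optional values: new ones can be added later without breaking callers.
    public int Priority { get; set; }
    public string NotifyEmail { get; set; }
}

public static class ParametersDemo
{
    public static void Main()
    {
        ProcessOrderParameters p = new ProcessOrderParameters("ORD-1");
        p.Priority = 5; // the caller sets only what it cares about

        Console.WriteLine(p.OrderId);  // ORD-1
        Console.WriteLine(p.Priority); // 5
    }
}
```

Adding a third optional property later leaves every existing caller untouched, which is the whole attraction; the cost, as noted above, is that the orchestration's signature no longer tells you what it actually needs.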
Last, but not least, there’s the closest thing to overloading – you could always create a second process, with the extended set of parameters and have the original process forward the original call while providing this empty parameter, effectively saving the hassle from the original caller.
This is really what a function overloading solution would look like, only that a process is not a function; it has a lot more impact in terms of compile time, configuration required and deployment, so the overall maintenance cost of such an approach is quite enormous for any decent sized project.
So – I can no longer ignore this problem and have to agree – allowing optional parameters with default values for orchestrations is quite important.
2011? :-)
Working on my Geneva Framework based STS scenario I’ve stumbled into a very weird and annoying case whereby if the user typed a URL in the wrong case (compared to the case of the V-Dir) the browser would enter a circular redirect between the STS and the RP.
I’ve started a forum thread, which you can find here, that got an answer from Peter Kron from MS, through which I’ve learnt that the path portion of a cookie is case sensitive; you can find this in this RFC spec as well (read 3.3.3) -
…the old and new Domain attribute values compare equal, using a case-insensitive string-compare; and, the old and new Path attribute values string-compare equal (case-sensitive). …
I don’t know if that’s just me, but I find this really surprising as, as a web user, I was never “trained” to treat URLs as case sensitive, but it appears that, according to the spec, any personalisation stored for a particular path might be lost if I enter the URL in the wrong case?
In the STS scenario case this would mean potentially me having to login again, although I have already logged in on the STS.
Peter suggested storing the cookie against the domain, which is not case sensitive, and that is good enough for me (for now?), but I don’t know if that’s realistic for all scenarios.
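The effect can be sketched in a few lines. This is a deliberately simplified model of cookie matching, assuming a plain prefix check on the path and an exact host comparison on the domain; the CookiePathCheck class and the /MyApp path are hypothetical.

```csharp
using System;

public static class CookiePathCheck
{
    // Simplified path check: the cookie's Path must be a case-sensitive
    // prefix of the request path for the cookie to be sent.
    public static bool PathMatches(string cookiePath, string requestPath)
    {
        return requestPath.StartsWith(cookiePath, StringComparison.Ordinal);
    }

    // Simplified domain check: host names compare case-insensitively.
    public static bool DomainMatches(string cookieDomain, string requestHost)
    {
        return String.Equals(cookieDomain, requestHost, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Cookie issued for the V-Dir as it is registered on the server.
        Console.WriteLine(PathMatches("/MyApp", "/MyApp/default.aspx")); // True
        // The user typed the URL in lower case: the cookie is not sent.
        Console.WriteLine(PathMatches("/MyApp", "/myapp/default.aspx")); // False
        // The domain does not care about case.
        Console.WriteLine(DomainMatches("example.com", "EXAMPLE.COM"));  // True
    }
}
```

Under this reading, the RP never sees its session cookie on the wrongly cased URL, treats the user as unauthenticated and redirects to the STS again, which would explain the loop.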
Earlier this week Microsoft announced the release of the beta version of BizTalk 2009.
I’m sure detailed posts of various bits will follow soon, but for now I thought I’d list a few points I’ve picked up (in no particular order)-
BizTalk projects are now “first class citizens” of Visual Studio [2008]; in practice it seems they are really “special” c# projects.
This means quite a lot really, to start with, for the most part they look and feel like c# projects (in the beta build the icon for the project is even the c# icon), but this extends to the property pages, existence of assemblyinfo.cs and the project file structure (which is now pretty much standard msbuild file)
It also means that you could add any c# type to a BizTalk project although in the current build you can’t add a new class using the menu, you are always redirected to the template selector which only shows BizTalk types, so –even if I pick the Add new class option
I get the following dialog -
Not sure when this will be fixed, but the workaround is simple (yet annoying) – create the type elsewhere and use the “Add Existing Item” option.
This will allow you, for example, to include any helper classes you use from your orchestration with the process itself.
Another point to notice is that this is only true for C# types; you cannot add VB code files to a BizTalk project.
Also – as the project is now a standard msbuild project integration with TFS is tighter.
Interesting to see in the installation process the following entry -
Basically installing this on a computer that has msbuild allows you to build BizTalk projects without having BizTalk or VisualStudio installed; using a build server for BizTalk is so much easier now!
The build process is completely incremental – each BizTalk artifact is a c# class, and so when building a project only those classes that have changed will be built, which should shorten the time it takes to build a BizTalk project.
Another fantastic addition is more “HAT” features in the admin console, but as Randal already blogged about this one in details I’ll point you there
Upgrading seems to be a breeze and both 2006 and 2006 R2 are supported. This can be done by opening any existing BizTalk project in VS 2008, and the migration wizard will pick it up and migrate it for you.
As expected, you can open a solution with multiple projects and they will all be upgraded, and you can use devenv /upgrade with both projects and solutions to do that from a script.
BizTalk 2009 supports debugging maps – this is done by selecting the “Debug Map” from any BTM file context menu -
This would use the test input instance and test output instance properties, and would basically open the XSL generated for this map and break at the root template
Worth noting that currently this does not support maps with multiple inputs or ones using extension objects.
I’m sure there’s a lot more to discover, but I have a 4 day old baby at home – so back to those nappies for me I’m afraid.
Over the last few weeks I’ve been working on implementing a Geneva Framework based STS that supports both active and passive scenarios.
This is progressing very well and I already have a fairly solid PoC running for both scenarios.
Generally, to make any web site participate in the federated identity “dance” all it takes is some configuration on the application’s web.config (separate post coming shortly), but up until today I have only done so for web applications developed as a .net 3.5 project.
Today I have created a bog standard .net 2.0 web application project (on VS 2008, though, not that it should matter) and applied exactly the same configuration as I did on the other web sites.
When I ran this, not so surprisingly, everything just worked. Isn’t it great?!
Of course – considering that everything is implemented as HttpModules – this should not be very surprising, but still.
It is important to remember, though, that .net 3.5, as well as the Geneva Framework, must be installed on the machine running the web site for this to work of course.
Most people know that when processing a message through the XmlDisassembler, if not explicitly told which schema to use through configuration, the disassembler would try to resolve the correct schema based on the message’s root node and namespace.
Most would also know, usually through the experience of getting it wrong so many times first, that if more than one assembly contains the same combination of root node and namespace for a schema, the receive pipeline, containing the disassembler, would fail with the error “Cannot locate document specification because multiple schemas matched the message type “<your message type here>”.”
What some miss (ehm, ehm) is that this is not true if the two schemas exist in two versions of the same assembly.
So – if you have an assembly called MySchemas.dll, version 1.0.0.0, …. which contains schema SomeNamespace#SomeRootNode and you have MySchemas2.dll, Version…… which contains schema SomeNamespace#SomeRootNode (same one) – BizTalk would fail.
If, however, you have the assembly MySchemas.dll, version 1.0.0.0, …. which contains schema SomeNamespace#SomeRootNode and then create MySchemas.dll, version 2.0.0.0, …. which contains schema SomeNamespace#SomeRootNode, BizTalk would quite happily ignore the former and use the latest version available.
Makes perfect sense, of course!
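The rule described above can be modelled in a few lines. This is only a model of the observed behaviour, not BizTalk's actual resolution code; the DeployedSchema and SchemaResolutionModel types and the sample message type are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of a schema deployed to the group.
public class DeployedSchema
{
    public string MessageType { get; set; }
    public string AssemblyName { get; set; }
    public Version AssemblyVersion { get; set; }
}

public static class SchemaResolutionModel
{
    // Models the behaviour described above: versions of the same assembly
    // collapse to the newest one; matches in different assemblies are an error.
    public static DeployedSchema Resolve(IEnumerable<DeployedSchema> deployed, string messageType)
    {
        List<DeployedSchema> candidates = deployed
            .Where(s => s.MessageType == messageType)
            .GroupBy(s => s.AssemblyName)
            .Select(g => g.OrderByDescending(s => s.AssemblyVersion).First())
            .ToList();

        if (candidates.Count == 0)
        {
            throw new InvalidOperationException("No schema matched the message type " + messageType);
        }
        if (candidates.Count > 1)
        {
            throw new InvalidOperationException("Multiple schemas matched the message type " + messageType);
        }
        return candidates[0];
    }
}

public static class SchemaResolutionDemo
{
    public static void Main()
    {
        string type = "http://example.org/orders#Order";
        List<DeployedSchema> deployed = new List<DeployedSchema>
        {
            new DeployedSchema { MessageType = type, AssemblyName = "MySchemas", AssemblyVersion = new Version(1, 0, 0, 0) },
            new DeployedSchema { MessageType = type, AssemblyName = "MySchemas", AssemblyVersion = new Version(2, 0, 0, 0) }
        };

        // Two versions of the same assembly: the newest one wins.
        Console.WriteLine(SchemaResolutionModel.Resolve(deployed, type).AssemblyVersion); // 2.0.0.0

        // The same schema in a differently named assembly: resolution fails.
        deployed.Add(new DeployedSchema { MessageType = type, AssemblyName = "MySchemas2", AssemblyVersion = new Version(1, 0, 0, 0) });
        try
        {
            SchemaResolutionModel.Resolve(deployed, type);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```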
We’ve been experimenting with calling ASMX web services from orchestrations without having to add a web reference (for the SOAP adapter) or use the generated items (for the R2 WCF adapter).
The idea, in short, is to achieve increased decoupling between systems even in a web service scenario -
Generally when you add a reference to a service in BizTalk 2006 or in R2 (although there are some clear differences between the implementation) the schemas for the request and response types are generated for you as well as an orchestration which defines message and port types using those schemas.
When using the SOAP adapter the types generated are somewhat “special” and they encapsulate a little bit of black magic; luckily the WCF adapter which shipped with R2 is much better in the sense that there’s nothing special about any of these artifacts (which also explains why it is now “Add Generated Items” and not “Add Service Reference” – as this is all it’s doing).
What this means is that if you follow the path that BizTalk leads you through you will get all these artifacts in the same assembly with your orchestration, which means you are now tightly coupled with the web service contract; not the end of the world, but if you want to stay true to the idea behind BizTalk - in which your processes can be masked from changes in the other applications you have to play pretend a little bit.
We thought that if we had the web service schemas in a separate assembly, and our process only used its own representation of the data (which would, ideally, be less than the entire data provided by the, mostly generic, web service), we could then map between the two in the port rather than in the orchestration, which would mean that if the web service changes, all we will need to do (in theory, at least) is re-deploy the assembly with the service’s schemas and the map.
So – here is how I went about doing that with the WCF adapter -
Following best practice I had an assembly to hold all of “My” schemas – these are the ones describing entities in my domain.
I then created an orchestration assembly to contain my orchestration, which references the schemas assembly; the orchestration assembly has no other dependencies.
I then created a third assembly to include all the types for the service - I went through the “Add Generated Items” wizard to get all the artifacts, but I only really used the schemas (and not the message or port types); this assembly, like the schemas assembly, has no dependencies.
I then progressed to create a fourth assembly to hold the mapping between my schemas and the service’s schemas; naturally this assembly references both projects, but, crucially, it is referenced by no-one.
So – at the end of this we get the following -
I then imported the send port bindings generated by the wizard to create the send port; I could have quite happily created it from scratch as there’s nothing special in that port - with the exception of one point, discussed next - so this was really just to save me some time, and added the two maps I’ve created to map the process output format to the service request and the service response to the process input format.
Goal achieved – the process knows nothing about the service – all is done externally to the process through port configuration.
But did it work? Almost - running this scenario I received a soap fault from the service complaining about a misunderstood soap action; makes sense I thought – how would BizTalk know which service operation I wanted?
Well, the WCF adapter has a very nice way to figure out the soap action to use (in my view) – as part of the port configuration there’s a bit of xml that provides mapping between an orchestration send port’s operation name and the required soap action; the setting looks something like this -
In the generated port type the operation name matches the operation name in the service description (“HelloWorld”, in my example), which, in turn, is mapped through this xml to the relevant soap action; as I did not use the generated types the operation name did not match – I simply left it as the default “Operation_1” (naturally…); that meant that when the request came the adapter failed to find a matching operation.
Somewhat annoyingly, what the adapter does when it can’t resolve the name is to assume that the entire setting should be used as the soap action and so the entire xml was written to the header; this behaviour is there to allow one to specify a fixed header to use, but I think the experience could be a bit better there – they could have had two different settings, or at least realise that if I’ve put a BtsActionMapping xml in there I do not intend for it to be used as the header itself(!), and so, if the relevant entry was not found the request should be suspended rather than going out incorrectly to the service; never-the-less the operation could not be resolved, of course, and the service returned a soap fault.
Fixing the issue was easy and simply meant adding the correct entry in the xml and running the scenario again, this time it completed successfully.
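The lookup, including the fallback complained about above, can be modelled roughly as follows. This is a sketch of the behaviour as described in this post, not the adapter's code; the Operation element with Name and Action attributes follows the layout of the port setting as I understand it, and the http://tempuri.org/HelloWorld action is an assumed value for the example service.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ActionMappingModel
{
    // Returns the SOAP action for the given operation name.
    // If the setting is not a BtsActionMapping document, or no entry matches,
    // the whole setting is returned as the action (the behaviour described above).
    public static string ResolveAction(string actionSetting, string operationName)
    {
        XElement root;
        try
        {
            root = XElement.Parse(actionSetting);
        }
        catch (XmlException)
        {
            return actionSetting; // plain string: treated as a fixed action
        }

        if (root.Name.LocalName != "BtsActionMapping")
        {
            return actionSetting;
        }

        XElement match = root.Elements("Operation")
            .FirstOrDefault(op => (string)op.Attribute("Name") == operationName);

        return match != null ? (string)match.Attribute("Action") : actionSetting;
    }

    public static void Main()
    {
        string setting =
            "<BtsActionMapping>" +
            "<Operation Name=\"HelloWorld\" Action=\"http://tempuri.org/HelloWorld\" />" +
            "</BtsActionMapping>";

        Console.WriteLine(ResolveAction(setting, "HelloWorld"));             // http://tempuri.org/HelloWorld
        Console.WriteLine(ResolveAction(setting, "Operation_1") == setting); // True
    }
}
```

The second call is the failure case from this post: the orchestration's operation is still called Operation_1, nothing matches, and the entire mapping XML ends up being used as the action.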
How does that differ using the SOAP adapter?
Using the SOAP adapter the approach was naturally very similar; pretty much the same assemblies, pretty much the same artifacts; there are three key differences though -
For starters the soap adapter requires a proxy; in most scenarios you’re using a web port type which provides the adapter with a proxy and so in most cases you don’t have to worry about this at all; I can imagine that some are probably not even aware but the send port, using the SOAP adapter, will have the web service proxy set in the “Web Service” tab of the adapter configuration to “Orchestration Web Port”.
Alternatively you can provide a custom proxy class, which is a topic by itself (and you can check it out in Richard Seroter’s post on the topic here), but in most standard cases this is not required.
As I’m not following the “standard” approach I had to create a custom proxy for my send port; I did this by using WSDL.exe and configuring the proxy class in the send port as described in Richard’s post.
In my case, however, unlike Richard’s, I did not wish to pre-define the method called in the send port; luckily – the configuration allows you to set it to “Specify Later”, which means the method name will be provided per request through the message context (using the SOAP.MethodName property).
Taking the “Specify Later” approach means I don’t have to have a send port per method, which is good of course, but pay attention to my note regarding number of ports in the summary below.
Now that I have the send port and proxy configuration sorted I needed to get the web service’s schemas; I could do that by using XSD.exe and add the output to my service types assembly.
Last thing – when using the soap adapter you don’t generally need to have an XmlDisassembler in the pipeline; however – if you want BizTalk to be able to run a map it needs a “proper” message type in the context, not that awkward one the SOAP adapter puts, and so the XmlDisassembler becomes mandatory in this scenario.
Other than that everything else is pretty much the same.
So – to summarise –
Calling a service from a process, without the process knowing ANYTHING about the service implementation is very easy, the story is slightly better in the WCF adapter case in my view, but both seem to me quite reasonable.
The only downside to this approach that I could think of so far, is that you are likely to end up with as many send ports as you have response output formats -
As far as requests from the orchestrations to the web service are concerned BizTalk will quite happily pick up the right map from a list of configured maps based on the input; so - if process A has one output format and Process B has a different output format and they both share the same send port – BizTalk will pick up the relevant map to convert either outputs to the service’s request.
On the way back, however, the incoming message (the service’s response) always looks the same, and so BizTalk will have no way of knowing which map to pick from the list.
That means that multiple send ports will have to be created for such cases so that there’s only one map for the service’s response; a large number of those may have some impact on the overall performance of the server group as the number of subscriptions that need to be evaluated increases; what “large” means in this context and how big is the impact is not something I could say easily, so I’d suggest doing some benchmarking to find out in your environment if you are concerned.
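The asymmetry between the two directions can be shown with a small model of map selection by source message type. The PortMap and MapSelectionModel types and all message type names are hypothetical, and the model only counts candidates; it does not claim to reproduce what BizTalk does when more than one map matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of a map configured on a port.
public class PortMap
{
    public string Name { get; set; }
    public string SourceMessageType { get; set; }
}

public static class MapSelectionModel
{
    // Returns the names of all configured maps whose source type matches the message.
    public static List<string> Candidates(IEnumerable<PortMap> maps, string messageType)
    {
        return maps.Where(m => m.SourceMessageType == messageType)
                   .Select(m => m.Name)
                   .ToList();
    }

    public static void Main()
    {
        // Outbound: each process has its own output format, so the source types differ.
        List<PortMap> outbound = new List<PortMap>
        {
            new PortMap { Name = "ProcessAToRequest", SourceMessageType = "urn:processA#Output" },
            new PortMap { Name = "ProcessBToRequest", SourceMessageType = "urn:processB#Output" }
        };

        // Inbound: every map starts from the same service response.
        List<PortMap> inbound = new List<PortMap>
        {
            new PortMap { Name = "ResponseToProcessA", SourceMessageType = "urn:service#Response" },
            new PortMap { Name = "ResponseToProcessB", SourceMessageType = "urn:service#Response" }
        };

        Console.WriteLine(Candidates(outbound, "urn:processA#Output").Count); // 1
        Console.WriteLine(Candidates(inbound, "urn:service#Response").Count); // 2
    }
}
```

One candidate on the way out means the choice is unambiguous; two on the way back is exactly the situation that forces a send port per response format.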
It took me a little while (and quite a bit of help from others on this thread) to get to a relatively simple implementation, so I thought I’d summarise the steps I’ve taken –
At the risk of stating the obvious I would definitely recommend making sure the overall STS scenario works well using windows authentication before changing it to support custom authentication.
Once that’s done change the STS’ bindings’ clientCredentialType to UserName and the establishSecurityContext to false.
<ws2007HttpBinding>
<binding name="UserNameAuthentication">
<security mode="Message">
<message establishSecurityContext="false" clientCredentialType="UserName"/>
</security>
</binding>
</ws2007HttpBinding>
The equivalent changes need to be made on the client’s binding going to the STS. These may not be obvious at first glance – on the client you will have the endpoint representing the RP and using ws2007FederationHttpBinding; inside this binding’s configuration you will find the issuer element, which is somewhat similar to an endpoint; this represents the STS’ endpoint and as such has a binding (ws2007HttpBinding) and a binding configuration; it is in that binding’s configuration that you need to change the credential type. Setting it on the wrong binding, as I did initially, would set you back a couple of hours :-)
Next, as we’re using username authentication for the STS, a service certificate must be used so that the credentials can be encrypted. This is done through configuration of a service behaviour on the STS service as such:
<behaviors>
<serviceBehaviors>
<behavior name="STSBehaviour">
<serviceCredentials>
<serviceCertificate findValue="STS" storeLocation="LocalMachine" storeName="My" x509FindType="FindBySubjectName"/>
</serviceCredentials>
<serviceMetadata httpGetEnabled="true"/>
<serviceDebug includeExceptionDetailInFaults="true"/> <!-- use for debug only -->
</behavior>
</serviceBehaviors>
</behaviors>
(don’t forget to wire the behaviour to the service...)
As the test certificates I’m using are not valid, I needed to disable validation on the client side; I could not find a way to do this through configuration, as at the client there isn’t an endpoint as such for the STS service, just the issuer element in the ws2007FederationHttpBinding, so I’ve done this in the client code (this is a temporary measure for development only!) –
proxy.ClientCredentials.ServiceCertificate.Authentication.CertificateValidationMode = X509CertificateValidationMode.None;
proxy.ClientCredentials.ServiceCertificate.Authentication.RevocationMode = X509RevocationMode.NoCheck;
The current version of the Geneva Framework, unlike Zermatt before it, does not support the userNameAuthentication element in the serviceCredentials service behaviour (to be accurate you can kind of force it to do so, but that’s planned to be blocked in the near future, so for all intents and purposes you should not include this element; see more information in the thread mentioned above).
In order to implement authentication a custom SecurityTokenHandler needs to be added; to do so I created a class that inherits from WindowsUserNameSecurityTokenHandler and overrode the ValidateToken method; again – several samples of such an implementation exist on the thread, but the idea is to validate the username and password (made available through the SecurityToken parameter to the method) in whatever way you wish and then, ideally, add some claims to the ClaimsIdentityCollection output; this should generally include the identity, authentication method and authentication instant, but you can add whatever you wish.
To wire the custom handler to the STS service a bit more configuration is required on the STS side -
<microsoft.identityModel>
<securityTokenHandlers>
<remove type="Microsoft.IdentityModel.Tokens.WindowsUserNameSecurityTokenHandler, Microsoft.IdentityModel,Version=0.5.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"/>
<add type="MyUserNameSecurityTokenHandler, MyUserNameSecurityTokenHandlerAsembly"/>
</securityTokenHandlers>
</microsoft.identityModel>
This replaces the built-in WindowsUserNameSecurityTokenHandler with my class that inherits from WindowsUserNameSecurityTokenHandler and adds custom implementation.
Note: I needed to add the definition of this section as such –
<section name="microsoft.identityModel" type="Microsoft.IdentityModel.Configuration.MicrosoftIdentityModelSection, Microsoft.IdentityModel,Version=0.5.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"/>
I hope that makes sense….
A couple of days ago I posted about the changes I've had to make to allow my custom STS to work with the updated Geneva framework. There's one more, quite crucial, change that I had to make, which I will try to describe next -
If my understanding is correct (and unfortunately there's every chance that it is not, so if you know otherwise please do comment) the October Geneva SDK has tightened security a little bit around token validation.
I believe that in the previous version of the SDK the RP simply made sure that a token was included with the request, and that this token was signed by a party whose certificate exists on the server (and is accessible); the RP did not check which certificate was used to sign the token.
As far as I can tell the Geneva Framework SDK now behaves differently - if you execute the same code and configuration you had before (barring necessary changes to allow the code to compile on the new version, but these are mostly name changes) you will get the following error from the RP:
"An unsecured or incorrectly secured fault was received from the other party. See
the inner FaultException for the fault code and detail."
Basically the client gets a token from the STS and attaches it to the request, but the RP does not recognise the issuer of the token; in order to instruct the RP to accept tokens signed by a particular STS you need to provide it with a list of issuers you accept. This can be done using the following configuration, for example -
<microsoft.identityModel>
<issuerNameRegistry type="Microsoft.IdentityModel.Tokens.ConfigurationBasedIssuerNameRegistry, Microsoft.IdentityModel, Version=0.5.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35">
<trustedIssuers>
<add name="STS" thumbprint="7a0671d475673c1ab131ca1c0c804e4fbd385140"/>
</trustedIssuers>
</issuerNameRegistry>
</microsoft.identityModel>
This bit of configuration lists all the certificates that are acceptable for STS token signing.
It is interesting to note that this model is completely extensible - you can define your own IssuerNameRegistry type that would look and behave differently if you have other means of listing those; the same can also be done via code, which is the example provided with the SDK - you define a custom IssuerNameRegistry class -
using System;
using System.IdentityModel.Tokens;
using Microsoft.IdentityModel.Tokens;

namespace ClaimsAwareWebService
{
    public class TrustedIssuerNameRegistry : IssuerNameRegistry
    {
        /// <summary>
        /// Returns the issuer Name from the security token.
        /// </summary>
        /// <param name="securityToken">The security token that contains the STS's certificates.</param>
        /// <returns>The name of the issuer who signed the security token.</returns>
        public override string GetIssuerName( SecurityToken securityToken )
        {
            X509SecurityToken x509Token = securityToken as X509SecurityToken;
            if ( x509Token != null )
            {
                //Note: This piece of code is for illustrative purposes only. Validating certificates based on
                //subject name is not a good practice. This code should not be used as is in production.
                if ( String.Equals( x509Token.Certificate.SubjectName.Name, "CN=STS" ) )
                {
                    return x509Token.Certificate.SubjectName.Name;
                }
            }
            throw new SecurityTokenException( "Untrusted issuer." );
        }
    }
}
and then when configuring the host for the RP service you provide this as a parameter -
FederatedServiceCredentials.ConfigureServiceHost(host, new TrustedIssuerNameRegistry());
And while I'm on the subject - as this has sent me going in circles - it appears that the framework is not happy with claim-less tokens, so if you're dumb enough (as I was) and end up not adding any claims (I was adding them based on the requested claims in the incoming request, which, at some point, was empty in my configuration) you will get an error, which, after setting the ServiceDebugBehavior, would read "A SamlAssertion requires at least one statement. Ensure that you have added at least one SamlStatement to the SamlAssertion you are creating."
I can't decide about this one - does it not make sense to have a scenario in which you just want to get a signed token to indicate that an STS has authenticated the caller, but don't actually need any claims? Not that it's a problem to find at least one claim to add (identity and authentication method are two easy examples), but speaking in principle I'm not yet convinced that not having any specific claim should be an error.
A few weeks ago I published this post about some experiments Randal Van Splunteren and I did around message creation.
Not surprisingly I was asked to post the solution we've used, and so I have uploaded it here.
Have fun! (let me know if anything's missing or unclear, it's been a while since I ran this...)
I have already mentioned that Zermatt has been renamed as the "Geneva Framework", which makes total sense.
At PDC Microsoft released a new download for the "Geneva Framework", which I have downloaded today to check some of my code against.
While not at all an extensive list, here are the changes I had to do to my code to get it to work with the updated framework -
On the STS:
- The SecureTokenService class, which is the base class for any STS implementation has moved to the main Microsoft.IdentityModel namespace (it formerly existed under it's own namespace - Microsoft.IdentityModel.Service)
- The GetScope method of the SecureTokenService is now marked as abstract and so has to be implemented (I believe it previously was not abstract, so a base implementation could have been used, either directly or indirectly through an overriding method).
- ClaimsPrincipal no longer has a 'Current' property; you can get the claims principal from an IPrincipal instance using the CreateFromPrincipal method or from an IIdentity instance using the CreateFromIdentity method.
- GetOutputSubjects renamed to GetOutputClaimsIdentity, the order of the parameters has changed a bit (but otherwise remained the same) and the return value is now IClaimsIdentity and not ClaimsIdentityCollection (which, again, makes perfect sense)
- In the STS service configuration I have changed the bindings from wsHttpBinding to ws2007HttpBinding and the STS contract from IWSTrustFeb2005SyncContract to IWSTrust13SyncContract.
On the RP:
- ExtensibleServiceCredentials, which is used to configure the RP's host to use the Geneva Framework is now called FederatedServiceCredentials
- To get the list of Claims in the RP you no longer use something like "(IClaimsIdentity)ClaimsPrincipal.Current.Identity;" but instead check the CurrentPrincipal of the current thread - "IClaimsIdentity identity = Thread.CurrentPrincipal.Identity as IClaimsIdentity;"
I'm currently doing some work with the Geneva Framework (formerly known as "Zermatt"), which I am very excited about;
With the SOA wave and now the coming Cloud wave, federated identity becomes a crucial component in the enterprise and it is great to see such a good story for it from Microsoft.
Using the "Zermatt" SDK (I now need to download the updated framework and align with it) I have successfully, and quite simply, managed to create both an active STS scenario and a passive STS scenario, both sharing the same underlying STS code; this was a great experience and I hope to post some more details over the next few days.
I was, however, a little bit surprised by the behaviour of the framework around non-optional claims -
In my scenario the RP (= relying party, the service the client actually wants to call) indicates through its configuration that it requires specific (custom) claims, which are not optional -
<security mode="Message">
<message>
<claimTypeRequirements>
<add claimType="http://myCompany/claims/someClaim" isOptional="false"/>
<add claimType="http://myCompany/claims/someOtherClaim" isOptional="false"/>
</claimTypeRequirements>
<issuer address="http://localhost:6000/STS"/>
<issuerMetadata address="http://localhost:6000/STS/mex"/>
</message>
</security>
When the client adds a web reference to this service, it is correctly configured with the STS details and the required claims (not posted here, I will try and describe my scenario in detail in a separate post) and so when it calls the service, WCF ensures it first hits the STS requesting the claims as indicated in the config.
You would all probably know that when thinking about any aspect of security in WCF the story is very “tight”, in the sense that you could set up pretty much all the requirements in configuration should you wish to and you could trust that the service’s code will never get executed if these are not met; I believe this is a key design point for WCF - the implementer of the method should not need to worry about how authentication is implemented, nor should you need to change the code if you decide to change your authentication method.
Considering this I expected the STS to try and provide all the claims it can based on the request message and/or configuration for the RP, and then I would expect the channel on the RP side (using the "Geneva" Framework) to reject any requests that arrived without all the non-optional claims BEFORE calling the service’s code.
When testing my scenario I deliberately set the STS code so that it does not provide the required claims, and was surprised to find out this was not the case.
My service's method was called whether both claims existed or not; I did have, of course, full access to the claims in code and so it was fairly easy to validate the existence of the claims required, but this seemed a little misaligned with the WCF approach to all the other security aspects and quite wrong, frankly.
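The in-code validation mentioned above can be sketched roughly as follows. This is a language-neutral illustration in Python, not Geneva Framework code: the `Claim` class, `missing_required_claims` and `call_service` are hypothetical names, and only the two claim type URIs come from the configuration shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical stand-in for a claim carried in the issued token."""
    claim_type: str
    value: str


# Claim types the RP marks as non-optional (copied from the config above).
REQUIRED_CLAIM_TYPES = frozenset({
    "http://myCompany/claims/someClaim",
    "http://myCompany/claims/someOtherClaim",
})


def missing_required_claims(claims, required=REQUIRED_CLAIM_TYPES):
    """Return the sorted list of required claim types absent from the token."""
    present = {claim.claim_type for claim in claims}
    return sorted(required - present)


def call_service(claims, operation):
    """Run the operation only when every required claim is present."""
    missing = missing_required_claims(claims)
    if missing:
        raise PermissionError("missing required claims: " + ", ".join(missing))
    return operation()
```

A check like this has to be repeated at the top of every service method, which is exactly the duplication one would hope the channel takes care of.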
I could not find much help online (this is still early days for the framework), and checking with a couple of people they all confirmed both my observation and my expectations; luckily for me, though, I was able to attend PDC and so I made sure to give a visit to the Identity folks' booth.
I'm happy to say that they as well have confirmed that the expectation is quite valid and indeed, they expect this behaviour to change before RTM; hopefully this will happen which would keep things nice and tidy.
Microsoft announced "Dublin" in September, but up until now (PDC) there has been very little information about how it is going to look.
Over at PDC we are finally able to see some sessions, as well as visit the various .net booths, get a first-hand look at the "bits" and ask questions about the new technologies.
However, at the time of writing, I have not yet attended any Dublin sessions, which really means anything I write here is currently speculation based on some stuff I've seen and heard "in the corridors of PDC".
I have, however, spent enough time at the WCF/WF booth to probably make them call me various names behind my back (I'm sure) :-) and I think I have got at least an idea of what Dublin is.
Over the next couple of days we will hear and see more, as well as get the chance to play with "the bits" handed over yesterday afternoon, so I'm hoping to be able to give a fuller view of what it is (and correct any inaccuracies posted here now)
So - clearly at this point I can only speculate based on the stuff I've seen, so you can think of this post as me sharing with you the (slow) process of learning what Dublin is (or, of course, you can stop reading now).
"Dublin", to begin with, is an "add on" (or extension, if you prefer) to IIS manager; once installed you get another set of options when administering a virtual directory.
These options, as you can imagine, are directly related to configuring, managing and monitoring WAS hosted WCF services; and so here's the first important point - Dublin is not a new host, it is a management layer on top of the existing host - WAS.
Another point comes out of this - and again I'm speculating and mostly trying to make sense of the stuff I've seen over the last couple of days - if "Dublin" is not a new host, where does WF fit in? Well - MS are pushing the WCF+WF (or WCF activated Workflow) message really hard.
Until now I mostly thought of WF as a workflow capability one can add into one's application as an integral process; sure, WF can work with WCF very well, and there is, and has been for a while now, a good story around this and several examples; but, I guess coming from my BizTalk perspective, I thought that if you needed a long running, WCF activated, process you'd use BizTalk; now MS are pushing a different alternative - a WF+WCF solution hosted in WAS.
The thing is that this is not a new option; as I've said, it has been around for some time now. The difference is mostly strategic, I think - there's still a lot of place for BizTalk (more than I thought last week, admittedly) - but if you need a fairly lightweight solution for WCF activated workflow this is definitely a good option.
Back to "Dublin" - if WCF + WF have been around for a while, what is new? Well - as I've suggested, the main feature you'd see or hear about is administration - using Dublin's extension to IIS you would be able to manage your deployed services more easily; this includes things like configuring the tracking and persistence databases, configuring tracking settings on your services (and workflow), and - I'm told - possibly configuring a lot of aspects of your endpoints (although this was not demonstrated).
So - one way to think about it, and I hope I'm not doing it an injustice, is as an improved and extended WCF configuration editor embedded into IIS (as well as supporting some aspects of WF configuration).
Another aspect, possibly one with a higher impact, is the ability to view long running instances; MS expect your services to activate WF (there's little benefit in using the Dublin features otherwise, although the configuration tools are still very much relevant).
As your workflow runs it may wait for external events, sometimes for minutes (or hours, or days), in which case it is more than likely (one would hope) that it will get persisted using a persistence service; Dublin has a UI over that service (or is it the database? in which case it is limited to the out-of-the-box persistence implementation) which allows you to view all your running (and "dehydrated") workflows; this is something we BizTalk guys take pretty much for granted but is a blessing for anyone serious about using WF.
If you had WF tracking enabled you could drill down into your tracking to see exactly what the WF has been through already, which would be very useful, but at the moment it is a bit rough - you get effectively the contents of the database displayed on screen, nothing like the visualisation WCF tracing has to offer.
I'm told that there are thoughts (or was it concrete plans?) to take this area further and look at combining the WCF and WF tracing/tracking and offering a better, consolidated, view of the data, one that actually helps in making sense of it all; but that is not there in this very early version.
There's a bit more to the administration capabilities, but that gives you, I hope, an idea of what's on the cards - better management of WCF+WF scenarios on WAS.
In addition to that a few other, quite cool, features are talked about; for example "Dublin" introduces an ability to add a forwarding service to the solution; this is a pretty bog standard WCF service that gets added to your solution as an SVC file (specific to the scenario you are building), but it uses an implementation provided by MS internally (a referenced assembly, presumably); I'm told you would configure some rules that will route incoming requests to your back-end services (currently these rules are configured through the config file, but clearly one can imagine this being managed through the IIS administration console as well).
Basically the service exposes an endpoint which you can configure; this would be the endpoint exposed externally to your services' consumers; on the other end it would consume your back-end services using the same or other bindings (power comes with responsibility!); in between it would run some logic, to be provided by MS, that would evaluate some rules (to be provided by you) to determine where the request should go; I'm not sure what these rules would include at this point, but the sample I've seen included some configuration with xpaths to elements in the request and expected values, which suggests support for content based routing.
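To make the idea of xpath-plus-expected-value rules concrete, here is a minimal sketch of content based routing in Python. It is not the "Dublin" forwarding service or its configuration format (which I have not seen in detail); the rule layout, the element names and the back-end addresses are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (XPath relative to the request root, expected text, back-end address).
ROUTING_RULES = [
    ("./Order/Region", "EU", "http://backend-eu.example/orders"),
    ("./Order/Region", "US", "http://backend-us.example/orders"),
]
DEFAULT_BACKEND = "http://backend-default.example/orders"


def choose_backend(request_xml, rules=ROUTING_RULES, default=DEFAULT_BACKEND):
    """Return the address of the first rule whose XPath value matches the request."""
    root = ET.fromstring(request_xml)
    for xpath, expected, address in rules:
        node = root.find(xpath)
        if node is not None and (node.text or "").strip() == expected:
            return address
    # No rule matched: fall back to the default back end.
    return default
```

For example, `choose_backend("<Request><Order><Region>EU</Region></Order></Request>")` picks the first rule's address, while a request with an unknown region falls through to the default.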
Again - this brings a very powerful feature from the BizTalk world to the WCF/WF world, which is a blessing, but with the lack of a message context and pipelines in WCF things are bound to look a little bit different.
This mechanism might also be useful to handle correlation scenarios when load balancing the server farm, but I'm not sure if there isn't a built in mechanism for that separately.
I have not yet seen, but I understand that "Dublin" will introduce some support for load balancing, mostly around long running scenarios where requests have to resume with previous state; I understand that this might be done by routing the request to the machine holding the state, or moving the state to the machine processing the request (or both?), details around this seem to have not been finalised (or made public) yet.
As I've said I'll try to provide more details as I find them, and as I start to play around with the copy I received yesterday (which won't happen before next week, realistically; I need to get *some* sleep!), but - my bottom line for now - if you were expecting a revolution in how you will be designing, configuring and mostly hosting your WCF and WF solutions (over and above all the changes in .net 4.0 to those technologies) you might think this is bad news, but the reality is that, quite simply, no revolution was needed - you could do all this stuff before, and now it just got easier; but mostly - in my view - Microsoft are signalling the direction they wish to see these technologies go; sure, there's a lot of place for WF inside your applications; "Dublin" shows there's a lot of place for it outside them as well.
One last note - there's a lot more everyone needs to digest as a result of the various announcements at PDC (mostly Azure and the cloud services platform), and these are strongly related. The WCF activated WF, hosted in WAS, the forwarding service with its content based routing capabilities, load balancing and better administration are key to utilising the .net services in the cloud and extending your workflow out of your app to your enterprise and into the cloud. Fascinating stuff!
Now that the main announcements in PDC are happening MS are starting to release a lot more information about it all.
Check out http://modelsremixed.com/ as well as http://msdn.microsoft.com/en-us/oslo/default.aspx
I'm lucky enough to attend PDC this year where Microsoft have just announced Windows Azure - the O/S for "the cloud" as well as a set of online services to be released.
Both are very big and very exciting and, naturally, very much related; as Ray Ozzie said, one thing you could clearly say about Microsoft is that they have always been one of the biggest clients of their own technologies, so when they are talking about releasing an O/S for the cloud and a set of fairly extensive online services (yet to be seen) you can expect a big correlation between the two technologies, each influencing the other over the next few months (and beyond).
Azure, to be used by Microsoft in its own data centres, and to be made available to the paying public via commercial agreements based on a combination of resources required and SLAs agreed (and met!), would allow companies to deploy their solutions (a web app was one thing briefly demonstrated) to "the cloud" or - if you prefer a more concrete definition - Microsoft's data centres, first in the US and then worldwide. Through a portal-like admin console you can deploy a solution developed and tested on your local development environment, not much differently than any other project you would have done before; this is crucial if adoption is to be wide - and it seems Microsoft are keen, and on the right track, to keep familiarity, and thus productivity, high, as well as, obviously, integration with existing tools.
Commissioning of more resource is a case of tweaking settings on the portal (and dishing cash, of course).
It will be very interesting to see how this gets adopted outside Microsoft, the key motivation being, of course, providing scalability and redundancy to deployed applications at a fraction of the cost otherwise required, as well as high flexibility in both these fields (supporting peak times, for example); but also, quite possibly, simply a lower cost of hosting and running the applications (for the smaller businesses?)
Even more interesting is the idea of online services - obviously these will all be hosted in the same data centres running on the same O/S, so all the questions around those apply, but another layer of considerations is added - what will be the capabilities of all these services? How flexible will they be? How will they perform and what would the learning curve be like? How far can Microsoft be trusted for companies to keep possibly their most precious data, and processes, in its data centres?
Ray Ozzie mentioned we're entering the fifth generation of software, the era of the "web tier"; there was a lot of hype about "the cloud" recently - some of it good, some of it bad - all of it suggests that we're at the brink of a big change;
I've decided to step out of the "announcements" streak and look more closely at what, I would imagine, would be the first question everyone would (or at least should) ask: how will this be secured?
I'm expecting that Kim Cameron's session on the "identity roadmap for software + services" will provide a good window into some of the aspects that need to be considered, and as I've been doing quite a lot of work recently on federated identity (posts to come) I'm particularly interested in this topic at the moment; enough to convince me to skip the "Lap around cloud services" happening next door.
Stay tuned.
Back in March I posted this entry about creating messages "from scratch" in BizTalk.
The post started a bit of an online discussion and a slightly more intensive offline discussion about the various ways to create messages and the differences between them.
As part of that discussion, Randal van Splunteren and I exchanged some emails, and Randal took the time and effort to create a test solution to compare the performance characteristics of the various methods, which I have helped validate.
Randal has been kind enough to let me summarise our findings in this blog (and it only took me 6 months...but I have my excuses) so here it is -
The scenario we've used to test is as follows -
There is one main orchestration that takes in a ‘command’ message using a file receive location; in this command message you can define the method to create a message:
- Map with Defaults (1)
- Map with xsl (2)
- Assignment with serialization (3)
- Assignment with resource file (4)
- Using undocumented API (5)
The first four options create messages according to the four methods I described in my blog post; the fifth one uses the CreateXmlInstance API suggested by Randal as a comment on my original post.
In the command message you can also set the number of messages that must be created;
Finally you can set if the method should use caching; we've implemented a very simple caching mechanism for the assignment and undocumented API methods (caching the generated instance in all three methods so it can be re-used in subsequent calls); for the map methods the caching parameter is ignored because BizTalk has its own caching for those methods.
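The "very simple caching mechanism" boils down to generating the instance once and handing back the stored copy on later calls. A rough Python sketch of that idea follows; it is not the code from the test solution, and `generate_instance` is a hypothetical stand-in for whichever expensive creation method is being measured.

```python
import functools

# Counts how often the expensive path really runs (for illustration only).
calls = {"count": 0}


def generate_instance(schema_name):
    """Stand-in for the expensive step, e.g. schema lookup plus instance generation."""
    calls["count"] += 1
    return "<{0}/>".format(schema_name)


@functools.lru_cache(maxsize=None)
def cached_instance(schema_name):
    """Generate the instance once per schema name and reuse it on later calls."""
    return generate_instance(schema_name)
```

Asking `cached_instance` for the same schema a hundred times runs `generate_instance` only once, which is the effect the caching flag has on the three non-map methods.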
When a particular test is finished the main orchestration writes out a ‘report’ message (again using file adapter) which contains the number of elapsed ticks the test took.
I ran all the scenarios 5 times and averaged the results; between each test I restarted the host to get as much of a like-for-like comparison as I could, so these numbers would not reflect true runtime performance of a live server but only the difference between the approaches. Initially I ran all the tests creating 1 message at a time; here are the results:
| msgs | Map using defaults | Map using xsl | Assign using serialisation | Assign using resource | Assign using API |
| 1 | 13,243,663 | 12,687,314 | 8,153,346 | 8,135,461 | 36,374,565 |
| 1 | 13,385,005 | 12,888,630 | 6,905,139 | 8,620,287 | 36,468,805 |
| 1 | 12,837,338 | 13,943,338 | 9,272,362 | 8,815,033 | 37,723,069 |
| 1 | 15,630,602 | 13,298,954 | 6,679,173 | 8,027,708 | 35,877,260 |
| 1 | 12,729,576 | 12,765,337 | 7,113,975 | 9,174,668 | 36,919,198 |
| Avg | 13,565,237 | 13,116,715 | 7,624,799 | 8,554,631 | 36,672,579 |
Then I ran all the tests again, this time creating 100 messages at a time -
| msgs | Map using defaults | Map using xsl | Assign using serialisation | Assign using resource | Assign using API |
| 100 | 15,195,199 | 15,254,912 | 9,158,223 | 8,951,018 | 231,352,547 |
| 100 | 14,421,621 | 16,523,637 | 9,259,892 | 8,700,856 | 226,704,695 |
| 100 | 15,199,198 | 15,010,499 | 8,476,670 | 10,222,202 | 232,357,798 |
| 100 | 16,725,023 | 15,684,085 | 9,110,269 | 9,866,252 | 227,806,462 |
| 100 | 15,349,885 | 14,475,857 | 9,101,879 | 10,295,228 | 226,928,786 |
| Avg | 15,378,185 | 15,389,798 | 9,021,387 | 9,607,111 | 229,030,058 |
Last I ran the 3 non-mapper versions with the caching enabled -
| # messages | Assign using serialisation (cached) | Assign using resource (Cached) | Assign using API (cached) |
| 100 | 9,696,044 | 9,478,015 | 41,350,100 |
| 100 | 8,288,120 | 10,087,574 | 37,410,620 |
| 100 | 9,156,289 | 10,473,718 | 36,493,118 |
| 100 | 8,715,621 | 10,001,671 | 40,628,198 |
| 100 | 8,289,295 | 9,951,817 | 37,919,237 |
| Average | 8,829,074 | 9,998,559 | 38,760,255 |
So, what have I spotted?
Well, to start with, comparing my results with those Randal had I learnt that my laptop is much slower than his machine...(but you can't see that from the results, nor, I suspect, do you care...)
But seriously -
- It is interesting to see how, with the exception of the API scenario, there is very little difference between the generation of 1 message and the generation of a 100.
- It is quite obvious that the API call is much slower than the rest, but that does not surprise me considering the amount of work involved (getting the schema from the database, generating the instance off the XSD retrieved...)
- For that reason, it is also quite obvious that this method benefited the most from the use of the cache (but was still significantly slower than the others), as the cache prevented the repeated access to the database and the xml generation.
- By the same token, caching did not make a very significant difference in the other scenarios, but again - I wouldn't consider that surprising (as there's very little work involved)
- And of course - it is clear that using assignment shape to create messages using either serialisation or a resource file is indeed the fastest way (serialisation being a little faster on my machine)
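The averages and the observations about the API method can be re-derived from the raw numbers. The short Python snippet below does so for the "Assign using API" column only, using the figures copied from the three tables above; the variable names are mine.

```python
from statistics import mean

# Elapsed ticks for the "Assign using API" column, copied from the tables above.
api_1_msg = [36_374_565, 36_468_805, 37_723_069, 35_877_260, 36_919_198]
api_100_msgs = [231_352_547, 226_704_695, 232_357_798, 227_806_462, 226_928_786]
api_100_cached = [41_350_100, 37_410_620, 36_493_118, 40_628_198, 37_919_237]

avg_1 = mean(api_1_msg)
avg_100 = mean(api_100_msgs)
avg_cached = mean(api_100_cached)

# Rounded averages, matching the "Avg" rows in the tables.
print(round(avg_1), round(avg_100), round(avg_cached))
print("100 messages vs 1 message: {:.1f}x".format(avg_100 / avg_1))
print("uncached vs cached (100 messages): {:.1f}x".format(avg_100 / avg_cached))
```

On these figures the uncached API method takes roughly 6.2 times as long for 100 messages as for one, and caching brings the 100-message run back to about the cost of a single uncached call.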
I hope you find this useful and again - many thanks to Randal for all his effort in helping me get this out.
The other day I ended up developing a process that would take one of two message types (using XmlDocument as the underlying message type), something I don't usually advocate, but I agree of course that occasionally it is the right way to go.
Naturally the first thing I wanted to do is to convert the message from either format to a single message type.
I normally write my own xsl (as opposed to using the mapper) and so I knew that the template-based xsl scripts are geared towards this type of work through the use of apply templates matching logic;
In theory I should be able to create a map that takes a message of type XmlDocument and spits out a single format; the xsl script under the covers would use apply-templates to match the possible root nodes and execute the correct template to map to the target format correctly.
This is only theoretical though as the compiler does not let you build a project with a btm file that has System.Xml.XmlDocument as the input type; a schema (type) must be selected.
Furthermore, the orchestration designer does not let you select a message as an input or output if it is not strongly typed (although that's just a designer issue; if you edit the ODX directly the compiler is happy enough, not that I suggest anyone do that).
This leaves the following alternatives -
You could write your own code to run xsl scripts, and call that from your process passing in the XmlDocument.
You could, of course, have multiple maps - each with a different input message type- all using the same xsl underneath the hood.
You could decide to hack a little bit and create your btm with any schema as the input type (selecting one of the alternatives for the input might be a good idea, or a specially created schema to indicate the scenario at hand); you could then use the transform function in an expression shape to run that map passing in the XmlDocument message as the input message;
You could do that because the transform function does not actually validate the input (or the output) against the schema at runtime, only the out-of-the-box transform shape does that, so as long as you avoid it you're ok.
Of course it has the implication of having a btm file that does not correctly represent reality (in terms of the input message), but I guess for some it's easier to live with that than for others...
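The "match on the root node, then map" logic that the xsl templates would perform can be sketched in plain code as well. The Python below is only an illustration of the dispatch idea, with ordinary functions standing in for xsl templates; the two source formats (`OrderA`, `OrderB`) and the target `Order` format are invented for the example.

```python
import xml.etree.ElementTree as ET


def from_format_a(root):
    # Hypothetical source format A: <OrderA><Id>..</Id></OrderA>
    return {"id": root.findtext("Id")}


def from_format_b(root):
    # Hypothetical source format B: <OrderB ref=".."/>
    return {"id": root.get("ref")}


# One handler per possible root node, playing the role of xsl:template match="...".
HANDLERS = {"OrderA": from_format_a, "OrderB": from_format_b}


def to_canonical(xml_text):
    """Map either input format to a single <Order> document."""
    root = ET.fromstring(xml_text)
    handler = HANDLERS.get(root.tag)
    if handler is None:
        raise ValueError("unsupported root element: " + root.tag)
    order = ET.Element("Order")
    ET.SubElement(order, "Id").text = handler(root)["id"]
    return ET.tostring(order, encoding="unicode")
```

Both `to_canonical("<OrderA><Id>42</Id></OrderA>")` and `to_canonical('<OrderB ref="42"/>')` produce the same single target format, which is the whole point of the exercise.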
This was waiting in the drawer for ages now and somehow never quite made it onto the blog...
Most of our web service's methods take one enum or another, which generally works well, but it did take us a bit of experimentation to figure out how to call them from a BizTalk orchestration; here are the details -
Take for example this version of a HelloWorld web service:
using System.ComponentModel;
using System.Web.Services;
using System.Xml.Serialization;

[WebService(Namespace = "http://tempuri.org/")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
[ToolboxItem(false)]
public class Service1 : System.Web.Services.WebService
{
    [WebMethod]
    public string HelloWorld(MessageType message)
    {
        // Echo the enum value back so the caller can see which one arrived.
        return "Hello World; " + message.ToString();
    }

    // The enum is serialised as a type in its own namespace.
    [XmlType(AnonymousType = false, Namespace = "http://MyNamespace")]
    public enum MessageType
    {
        Hello,
        GoodBye
    }
}
(I’ve explicitly marked the enum as a non-anonymous type and given it a namespace to represent real world scenarios from my experience, see http://www.sabratech.co.uk/blogs/yossidahan/2007/02/anonymoustypes-and-serialization.html )
Looking at the test page for this web service you get an example request that looks like this –
POST /Service1.asmx HTTP/1.1
Host: localhost
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://tempuri.org/HelloWorld"

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <HelloWorld xmlns="http://tempuri.org/">
      <message>Hello or GoodBye</message>
    </HelloWorld>
  </soap:Body>
</soap:Envelope>
Notice how the parameter name is message (as the name of the parameter of the method?)
And indeed if I use HttpAnalyzer to examine the traffic “on the wire” I can see the request message going through as –
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<HelloWorld xmlns="http://tempuri.org/">
<message>GoodBye</message>
</HelloWorld>
</soap:Body>
</soap:Envelope>
Which makes perfect sense.
Moving to a BizTalk project - I add a web reference from my BizTalk project to generate the proxy, i.e. the port type, schemas, message types etc.
The request message-type generated has a part for my enum, as expected, however, unlike what I'd expected, to pass an enum value to the web service, you don't wrap the enum value in the parameter name as it appears on the wire (and in the test page), which would have been -
<message>Hello</message>
but instead the value is wrapped in an element that matches the enum type definition, as such -
<ns0:MessageType xmlns:ns0="http://MyNamespace">Hello</ns0:MessageType>
This is quite obvious once you look at the schema that got generated -
<xs:schema xmlns:tns="http://MyNamespace" elementFormDefault="qualified" targetNamespace="http://MyNamespace" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="MessageType" type="tns:MessageType" />
<xs:simpleType name="MessageType">
<xs:restriction base="xs:string">
<xs:enumeration value="Hello" />
<xs:enumeration value="GoodBye" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
but I just didn't bother initially, expecting to need to build the parameter the way it appears on the test page.
So in order to call the web service from a BizTalk orchestration you would need to create a message with the correct xml in an expression shape;
I use (with doc being a variable of type XmlDocument) -
doc = new System.Xml.XmlDocument();
doc.LoadXml("<ns0:MessageType xmlns:ns0=\"http://MyNamespace\">Hello</ns0:MessageType>");
and then I assign it to the request
WSRequest.@message = doc;
(where WSRequest is a message of the web message type generated as part of the add web reference procedure, and @message is the part that was created to represent the enum parameter)
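Outside BizTalk, the shape of that message part can be checked with a few lines of code. The Python sketch below builds the same namespace-qualified element and rejects values that are not in the schema's enumeration; `build_enum_part` is a hypothetical helper, not part of any BizTalk or .NET API.

```python
import xml.etree.ElementTree as ET

ENUM_NAMESPACE = "http://MyNamespace"
ALLOWED_VALUES = ("Hello", "GoodBye")  # the xs:enumeration values in the schema


def build_enum_part(value):
    """Return the XML for the enum message part, rejecting unknown values."""
    if value not in ALLOWED_VALUES:
        raise ValueError("not a MessageType value: " + value)
    # The element is named after the enum type and lives in the enum's namespace,
    # not after the method parameter ("message").
    element = ET.Element("{%s}MessageType" % ENUM_NAMESPACE)
    element.text = value
    return ET.tostring(element, encoding="unicode")
```

The string returned by `build_enum_part("Hello")` is the kind of content that would be loaded into the XmlDocument in the expression shape above.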
One of my clients has been using a particular third party's software for some time now (I won't name either organisation, for obvious reasons).
In their organisation both BizTalk and this software are key components and so they both play part in many scenarios.
Unfortunately, and quite surprisingly, this third party's software doesn't have any adequate support for integration, and the main way to interact with it is pushing some data into 'import tables' in the software's database and then running some EXE to process them. Quite horrendous.
To make matters worse, only one instance of this EXE is allowed to run at any one time, so most implementations require some form of queuing and a singleton pattern.
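The queue-plus-singleton arrangement can be illustrated with a small Python sketch: callers enqueue work and a single worker thread runs the imports strictly one after another. This is not how the client's BizTalk solution is built; `SingleRunner` and the `run_import` callback (which stands in for launching the EXE) are hypothetical.

```python
import queue
import threading


class SingleRunner:
    """Queue requests and let one worker thread run them one at a time."""

    def __init__(self, run_import):
        self._run_import = run_import  # stand-in for launching the vendor's EXE
        self._requests = queue.Queue()
        self.failures = []  # (batch_id, exception) pairs for failed runs
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, batch_id):
        """Enqueue a batch; returns immediately."""
        self._requests.put(batch_id)

    def wait(self):
        """Block until every queued batch has been processed."""
        self._requests.join()

    def _drain(self):
        while True:
            batch_id = self._requests.get()
            try:
                self._run_import(batch_id)
            except Exception as error:
                # Record the failure and keep the worker alive for later batches.
                self.failures.append((batch_id, error))
            finally:
                self._requests.task_done()
```

Because only the worker thread ever calls `run_import`, two runs can never overlap, whatever the number of callers submitting batches.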
Recently that third party released a version with *some* support for web services, and so, excited by the great news the development team went on to implement a process using the newly introduced web service.
The service that had been exposed relates to the product's reporting features - users can create custom reports using the product's UI and then, using web services, export the result of these queries.
I would have expected, based on past experience, a generic schema in the WSDL that would describe a report's output, irrespective of the fields returned, something like -
<row>
<column fieldName=".."></column>
<column fieldName=".."></column>
.
.
</row>
But the vendor's schema included some generic fields from their domain (report name, time of execution, etc.) and then rows and fields based on the specific report generated.
Because the report's structure is not known at design time, its fields do not exist in the WSDL; instead a report node exists with an xs:any declaration to include the actual report's contents.
Personally I prefer to avoid xs:any if possible, and I think a schema describing a generic report's result could have been created, but that's not the main problem; the main problem, in my view, is that the fields added underneath the report element (as well as the report element itself), which are generated by the application, all belong to the same namespace.
Because they could not predict the names report designers will use for fields it was more than possible to create an element with duplicate meaning which is a bad idea overall and also causes quite a bit of headache when one needs to create schemas for BizTalk to accurately describe the response of such a service.
One thing I will grant them is that they have thought of defining lax validation for the report, so duplicate elements will not cause validation errors.
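To show why the generic row/column shape I had expected is easy to consume, here is a minimal Python sketch that reads it without knowing the report's fields in advance. The `report` wrapper element and the sample field names are assumptions made for the example; only the `row`/`column`/`fieldName` shape comes from the snippet above.

```python
import xml.etree.ElementTree as ET


def read_generic_report(xml_text):
    """Turn generic <row>/<column fieldName=".."> output into a list of dicts."""
    root = ET.fromstring(xml_text)
    return [
        {col.get("fieldName"): (col.text or "") for col in row.findall("column")}
        for row in root.findall("row")
    ]
```

With this shape a single schema describes every report, and field names chosen by report designers end up as attribute values rather than as element names in the vendor's namespace.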
You've created your map, selected your schemas and (if you're anything like me) set the custom xsl path to the file containing the xsl script you've worked hard to create.
You then go to the orchestration and add a transform shape, but when you select the messages the designer notices you have selected different types to those in the map and kindly offers to change the types in the map for you; you think - how generous and kind...
However, it goes one step further, and without much notice, it is re-setting the custom xsl path (to nothing, that is), which would result in an empty output from the map when you run the process and one very frustrated BizTalk developer.
To be fair it does suggest that "some links may be lost" before doing all of that and ask for confirmation, and of course you can see their ("they" being Microsoft) side of the story - they assume that if you change types in the map the links/xsl may no longer be relevant...but I would argue that if I chose to use the same map, then they probably are, otherwise I'd start a new one....
The next UK SOA BPM user group meeting is scheduled for October 14th in London; check out the details in Mike's blog.
Microsoft has announced the next major version of BizTalk - BizTalk Server 2009.
Planned to be released 'H1 CY09', the new version, previously referred to in some circles as BizTalk "R3", adds support for the latest Microsoft stack - Windows Server 2008, SQL Server 2008, Visual Studio 2008, .net framework 3.5 and TFS 2008 - including automated builds and continuous integration as well as virtualisation using Hyper-V.
There are also further investments in B2B (EDI, SWIFT), RFID, enhanced LOB adapters and support for UDDI 3.0
The next CTP is planned for sometime in the last quarter of this year. While Microsoft previously announced that BizTalk '6' would be part of the 'Oslo' wave, this version will be released ahead of 'Oslo', which is great news for all of us longing to move to the latest platform with BizTalk Server.
See the announcement here
I would expect this one to do wonders for some of us out there trying to give a little bit more "juice" to our BizTalk implementations - grab it here.
If you wanted another example where good error handling in code could save some time, here's one -
Last week I was making some changes to a small process I've had for ages; the process was all working when I wanted to extract a small bit of information from the message and decided to use the XLANG/s xpath method to do so.
I planned to get the actual xpath string from a helper class (I always do, to help with maintenance), so I created a variable for it, but as a temporary step I put the xpath as the initialising value for the variable.
When trying to build the project I received an error in the error list - "Unexpected EOF". Not very useful.
This could have taken me ages to figure out, but luckily I knew the only thing I changed was adding that variable, so I changed the initialising value to TEST and tried to build; this time the error was much more meaningful -
"identifier 'TEST' does not exist in '<process name>'; are you missing an assembly reference?"
as well as -
"cannot find symbol 'TEST'"
Just out of curiosity I tried again with the value TEST THREE WORDS; this time I got
"identifier 'TEST' does not exist in '<process name>'; are you missing an assembly reference?"
and
"cannot find symbol 'TEST'"
as well as
"unexpected identifier: 'THREE'"
and
"expected ';'"
So what do I make of it?
- To start with, I should stop forgetting that the values for string variables must be provided in quotes.
- Having a single quote in a variable value results in the rather cryptic "Unexpected EOF" error.
- BizTalk could have done slightly more to validate the values of variables at design/compile time and save us developers precious time :-)
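The kind of check I have in mind could be sketched roughly like this. It is only an illustration in Python, not anything BizTalk provides: `check_initial_value` is a hypothetical helper, and the rule it applies (a string variable's initial value must be one double-quoted literal) is my reading of the errors above rather than the compiler's actual logic.

```python
import re

# One double-quoted literal: opening quote, then escaped characters or
# anything that is not a quote or backslash, then the closing quote.
_STRING_LITERAL = re.compile(r'^"(?:\\.|[^"\\])*"$')


def check_initial_value(value: str) -> str:
    """Return a readable verdict for a string variable's initial value."""
    text = value.strip()
    if _STRING_LITERAL.match(text):
        return "ok"
    if "'" in text:
        # This is the case that produced the cryptic "Unexpected EOF".
        return ("unquoted value containing a single quote; "
                "wrap the whole value in double quotes")
    return "unquoted value; wrap it in double quotes"
```

Running something like this over the initial values before building would have pointed straight at the variable I had just added.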
Both have been officially released today and can be downloaded here.
It is worth mentioning though that this time (although not the first time, and certainly not the last time) this "service pack" contains significantly more than just fixes and improvements - there are quite a few significant enhancements and new features in it.
Amongst others, that includes the long-awaited Entity Framework and ADO.NET Data Services. See more details here
A lot has been written about Zombies in BizTalk, but somehow they never really caused me much trouble. Until today.
In one of our processes we are calling a web service to perform a certain lookup.
That web service’s implementation (whether correct or not) is to return the instance of the class if found or null otherwise.
When the service returns a null BizTalk receives an empty message, which triggers an exception in the process -
Microsoft.XLANGs.Core.MissingPartException: The XLANG/s message has no part at index '0'. The total number of parts found in the message is '0'. If you expect a multipart message, check that the pipeline supports multipart messages such as MIME.
I’m quite all right with that, BizTalk is definitely correct – we have returned an empty stream as a message which means nothing.
And so - we've added an exception handler around the call to the service, and our orchestration’s flow carries on correctly all the way to the end shape, as expected.
However, contrary to expectations, at this point the orchestration instance gets suspended by BizTalk with the error - “The instance completed without consuming all of its messages. The instance and its unconsumed messages have been suspended.”
I believe this behaviour has been introduced in BizTalk 2006 - 2004 was quite “happy” to have the so called "zombies" in the process, and it was up to the developer to handle those (or not, as the case often was); since 2006 this had become an error and if a process ends with unconsumed messages the process gets suspended and the unconsumed messages are available for inspection.
Generally, I think that making a zombie an explicit error is a good idea, although it would have been nice to be able to set whether this behaviour is required on a per-process basis. In our case, however, we found this very problematic: we could not consume the message properly (due to the exception handler), and although we had handled the exception we still ended up with a suspended instance, as the message appeared as a zombie. We could not win.
I have opened a support call on this one, waiting to hear if the product group agrees that this is incorrect behaviour or not (I seem to be getting conflicting thoughts). I'll try to keep this post updated when I get an answer.
(You can read more details about this here - http://technet.microsoft.com/en-us/library/bb203853.aspx and http://blogs.msdn.com/biztalk_core_engine/archive/2004/06/30/169430.aspx
An interesting point is that this case is not listed in the list of causes for zombies in these articles, which makes me wonder – should an exception raised through a receive shape, which was properly handled by an exception handler, result in a zombie?)
I’ve been fortunate enough to have to go through a bunch of suspended items on our production server from the last week and do some impact analysis – understand what failed, why it failed, what is the impact on the business, and how we can mitigate (technically or non-technically)
I have to admit this is very interesting work, but I suspect only until it gets somewhat repetitive...
Anyway, this has made me come up with a new wish list item for BizTalk, this time for the admin console –
I was going through all the items in the list; for each one I did some investigation, and possibly raised some queries with others to understand the process/impact/etc.
Getting answers takes a while, so I end up having a bunch of suspended/dehydrated items waiting there until I can get around to resume/terminate them.
To make matters worse - often in BizTalk Server you would get more than one instance in the admin console that relates to a failure: the send port might be suspended, the orchestration sending the message might be suspended as a result, and another orchestration that called that (or that is waiting for a correlated response) might simply be dehydrated.
All of this made it harder to know, at a glance, where I stand - what have I looked at, what haven't I checked yet, what is this one waiting for and which other process it relates to.
I think there are a bunch of features that could be added to the admin console to make this work much better.
Things like being able to relate several suspended items together, tagging/colour coding items to categorise them, adding notes to items to indicate where in the investigation process I am etc.
I'm aware that there are several tools out there to help track support tickets etc., but they lack the relation to the tools admins use to monitor the application - in our world this would be primarily the BTS Admin Console!
I'm almost embarrassed to post this one, but it sent me chasing windmills for a couple of hours, so if I can save any other unfortunate person from that, it has served a purpose I guess -
I have a call from one orchestration to another.
In the calling orchestration I wanted to catch any exception that would occur on the called orchestration (or anywhere else down the line) and so I swiftly wrapped the call in a non-transactional scope and added an exception handler.
Being the keyboard-fan that I am (BizTalk is completely unfriendly to us keyboard types, but still...) when the type selector opened I did what I always do - I typed the name of the class I wanted (System.Exception) and pressed enter.
The problem was that I did not type the class name ('Exception') but the full name ('System.Exception'), and that got me the System.SystemException class as the selected type, only I did not realise that at the time. Apparently the type selector ignores special characters being keyed in?!
It was much later that I realised exceptions that were thrown in my called orchestration were not being caught in the calling one, and even then it took me a good couple of hours to realise what the reason was (I simply did not look carefully enough at the exception type - System.SystemException just looks like System.Exception when you (well - I) glance at it!)
Three takeaways for me -
- Pay more attention when doing even the most trivial tasks. (trivial, isn't it?)
- Consider being less zealous about using the keyboard when doing BizTalk development
- But seriously - consider always having a general exception handler to catch whatever exceptions were not caught by other handlers. I should have been mostly covered by catching System.Exception (although BizTalk, unlike C# or VB.NET, does support exceptions that do not inherit from System.Exception, see Charles Young's post on the subject), but this could have helped at least identify the problem.
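For anyone who wants to see the effect without opening an orchestration, here is a minimal sketch of the same mistake in Python (used only for illustration; the class and function names are made up). Catching a narrower type than intended, here `ArithmeticError` standing in for System.SystemException, lets an exception that derives from the base type sail straight past the handler.

```python
class AppError(Exception):
    """Derives from the base exception type, not from the narrower one."""


def call_child():
    # Stands in for the called orchestration failing.
    raise AppError("lookup failed")


def run_with_handler(handler_type):
    """The 'calling orchestration': one scope with one exception handler."""
    try:
        call_child()
    except handler_type as exc:
        return f"caught {type(exc).__name__}"
    return "completed"


def run_safely(handler_type):
    """Outer general handler, so an escaped exception is at least reported."""
    try:
        return run_with_handler(handler_type)
    except Exception as exc:
        return f"escaped as {type(exc).__name__}"
```

`run_safely(Exception)` reports the exception as caught, while `run_safely(ArithmeticError)` reports it as escaped, which is exactly where a general catch-all handler earns its keep.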
Michael Stephenson has suggested here that in his opinion the pipeline as a component is obsolete.
While something told me as I was reading this I could not agree with him 100%, I simply could not put my finger on the reason, I could not come up with a good enough argument against it; in fact - I still can't - which probably means he is right, but I have thought of the following two points -
The first one is not a justification, but a possible explanation of how things came about - as far as I understand it, pipelines were always meant to be created at design time, then "compiled" and deployed; in 2004 there wasn't a straightforward way of using per-port configuration of pipelines, but soon enough the information that this can be done, and how to do so, became public, and some developers (not all, obviously) started using this method.
It was only in BizTalk 2006 that the ability to configure a pipeline was exposed in the UI; I guess that this points out that it was not the planned use initially (and also that MS definitely "listens" to the community, or market - call it as you wish)
It is worth noting that Michael does not stop at the pipeline component's configuration, but suggests that the entire pipeline, i.e. which components exist in the pipeline and in what order, should be configurable; this is taking the per-port configuration to another level, but it does make sense.
The second point I have in mind is that there's one benefit of having the pipeline as an artifact of sort, and that is re-use (and I'm guessing this was at least one of the considerations by MS, when making the decision to make the pipeline a compiled component) and in my experience I do find pipelines fairly highly reusable artifacts.
But this last point is not a very strong counter-argument (not that I intended it as such), just expanding on Michael's opinion and suggesting that the latter should be provided similar to ports, which are completely configurable through binding files, but are re-usable.
If we could create a pipeline in the admin console as an entity in its own right (or through bindings) and then select it in ports, we could have the best of both worlds, mostly.
There would only be one thing I can think of that we would be missing out on, and that is the rich design environment we get when creating the pipelines in the pipeline designer - and I'm not necessarily talking about the drag-and-drop way of adding components, I could not care less about that if I'm honest, but the rich property editor, which is not currently supported in the generic per-port-pipeline-configuration.
Again - I'm sure this can be handled in the same way that adapter settings are handled in ports, given enough attention.
So - Michael - I am officially supporting your idea - where do I vote?
I like visual designers. What I don't like is visual designers that try to protect developers from themselves; they always seem to go just one or two steps too far.
I find myself (and others, I believe) complaining fairly frequently about the orchestration designer not letting us do this or that... and while in many cases it all makes sense (you really can't, or shouldn't be doing that), being over-protected can cause a lot of confusion and waste precious time. Here is an example of something that happened to me a couple of weeks ago -
I was configuring a start shape in one of my processes, and when I tried to select the orchestration I wanted to start, it simply did not appear on the list of available orchestrations.
It took me a while - I rebuilt the assembly with the started orchestration and refreshed the assembly with the starting one - twice, but could not get it to appear. I tried everything - including resorting to restarting Visual Studio - but could not see my process.
Only when I checked carefully the parameters requested by the started orchestration did I spot the problem - the orchestration had a 'ref' parameter.
Apparently orchestrations that have out or ref (I suspect they are pretty much the same in BizTalk) parameters do not appear on the list of processes in a start orchestration shape.
This makes sense really, as the start shape is asynchronous there's no way to use ref/out parameters with it. I just think a better design decision would have been to flag such selection as an error (at design and compile time) and not simply hide the irrelevant orchestrations.
This way it is immediately clear why I can't select the orchestration I planned to, I don't have to figure all of this out every time.
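To make the suggestion a bit more concrete, here is a rough sketch (in Python, purely for illustration) of a selector that lists every orchestration and attaches a reason to the ones that cannot be started, instead of silently dropping them. The `Param` and `Orchestration` types and the wording of the reason are all made up; this is not how the designer is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    name: str
    direction: str = "in"  # "in", "out" or "ref"


@dataclass
class Orchestration:
    name: str
    params: list = field(default_factory=list)


def start_candidates(orchestrations):
    """Return (name, reason) pairs; reason is None when it can be started."""
    result = []
    for orch in orchestrations:
        blocked = [p.name for p in orch.params
                   if p.direction in ("out", "ref")]
        reason = None
        if blocked:
            # Start is asynchronous, so nothing can flow back to the caller.
            reason = ("cannot be started asynchronously: out/ref parameter(s) "
                      + ", ".join(blocked))
        result.append((orch.name, reason))
    return result
```

With a list like this the designer could still show the orchestration, greyed out or flagged, along with the reason it cannot be picked.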
The first meeting of the UK SOA/BPM user group is just around the corner on the 17th of July.
You can find all the details here
See you there.
Everyone gets this error in the event log every once in a while -
The Messaging Engine failed to register the adapter for "<...>" for the receive location "<...>".
Please verify that the receive location exists, and that the isolated adapter runs under an account that has access to the BizTalk databases.
Mostly I get this when I publish a new BizTalk web service and forget to select the correct application pool (using Windows Server 2003).
Often this comes up when setting up an XP (or Vista) workstation for the first time and forgetting to sort out ASPNET permissions.
Sometimes it is simply a typo in the receive location's url, or a problem with the script that was supposed to enable it; all of those are pretty much suggested by the event log entry message.
But earlier this week I had another case, one which never happened to me before (and, arguably, shouldn't have happened at all, but if it did for me, there must be another idiot who would do the same! :-)), and that one is not mentioned in the message -
I have published an orchestration web service, selected the correct application pool and made sure the receive location was enabled but still this error would be logged.
What I didn't do (I blame it on having a cold) is to actually create an instance of the host under which the receive location is configured to run; I had a host configured and selected but no host instance...
You've created your pipeline component, added a few public properties for good measure (to make it a bit more flexible of course).
You went on to create a pipeline with your component, set all the properties you've introduced and deploy it to BizTalk Server.
You then configure a port to use the pipeline (and the contained component) and test your scenario. All is sweet.
But then - you decide to add one more property to your component.
You quickly go and change the component, adding the public property you've missed, and, if you're anything like me, you decide to skip the whole update pipeline, remove from port, undeploy, redeploy, re-configure port etc. routine and simply GAC the updated component, planning to set its value through the admin console to save the pain (and time).
However, when you open the port in the admin console and go to set the newly added property in the pipeline configuration, you are in for a surprise - your new property does not appear in the UI.
You double check everything - check your code, re-build, re-GAC and re-open the admin console, but it's all the same - the property is simply not there. You bang your head against the wall. Twice.
And then it hits you - it's not the component! it's the pipeline!
A pipeline's "source code" is essentially an XML document. Pipeline components, and their properties, are XML fragments within this XML.
When you add a component to the pipeline, its information, including all known properties (and their values) are added to the pipeline's XML.
And yes - you've guessed it right - when you're editing a pipeline's configuration through the admin console, the source of the generated UI you see is that XML, not the actual components' assemblies in the GAC.
If, like me, you have just GAC-ed a new version of the component and gone to the admin console to configure it, you're in for a disappointment, as your newly added property will simply not be there.
You will be forced to re-deploy the pipeline (unless you are happy to change XML in the management database, that is - and you shouldn't be).
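The mismatch is easy to picture with a small sketch. The XML below is a made-up, heavily simplified stand-in for a pipeline document (it is NOT the real pipeline file layout), and the component and property names are hypothetical; the point is only that the console's list comes from the stored XML, so a property that exists only in the new assembly never shows up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified pipeline document as deployed BEFORE the change.
PIPELINE_XML = """
<Pipeline>
  <Component name="MyComponent">
    <Property name="Enabled" value="true"/>
    <Property name="Prefix" value="ORD"/>
  </Component>
</Pipeline>
"""

# Properties exposed by the freshly GAC-ed component (hypothetical names).
COMPONENT_PROPERTIES = {"Enabled", "Prefix", "Suffix"}


def properties_shown_in_console(pipeline_xml, component_name):
    """What the configuration UI can offer: only what the stored XML lists."""
    root = ET.fromstring(pipeline_xml.strip())
    for comp in root.iter("Component"):
        if comp.get("name") == component_name:
            return {p.get("name") for p in comp.iter("Property")}
    return set()


def missing_from_console(pipeline_xml, component_name, component_properties):
    """Properties the new assembly has but the deployed pipeline does not know."""
    shown = properties_shown_in_console(pipeline_xml, component_name)
    return sorted(component_properties - shown)
```

Here `missing_from_console(PIPELINE_XML, "MyComponent", COMPONENT_PROPERTIES)` gives `['Suffix']`, the property that only a pipeline re-deploy will bring into the UI.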
A quick one -
If you're using the DirectEventStream BAM API and you're seeing the error "Flush failed to run", check the connection string (and related permissions) to the database.
Happened three times to me recently (don't ask!) - all of which were related to configuration problems.
This is somewhat of a recurring theme with me recently, but I want to discuss the contents of the management database; more specifically I want to discuss the fact that schemas get deployed to it and that most other things deployed will have a strong dependency on schemas.
As schemas are always at the bottom of the dependency chain, this means that on top of the expected difficulties one can experience when needing to change schemas, and the impact on other systems, the actual act of deploying a new schema is a problem in itself, because everything that depends on it has to be redeployed with it.
At best this is simply an annoyance to a developer who needs to re-deploy his entire solution as the schema evolves through the development cycle (versioning is not applicable in this scenario);
At worst this is an operational nightmare if a solution has to be updated/patched/evolved where a good versioning story does not exist (as is all too often the case, not that versioning would have solved this all).
As we are forced to remove the entire solution and then re-deploy with the new schema, we can expect, from my experience, the process to take quite a while for large solutions, which may take the business offline for a couple of hours.
Taking the risk of making a point about something I don't know enough about - the internal behaviour of BizTalk server with regards to deployed schemas (but one could say this is often the case...) - I would argue that as far as I can tell, schemas are not actually used all that often by the runtime.
(and because I accept I could be completely wrong here, please do share any thoughts/ideas/comments/insights/whatever on the subject - put a comment on this post or email me if you prefer. I'd love to hear some feedback on this.)
Anyway - as I was saying -
When you define a message type you select the schema at design time, and the designer may refer to that schema to do various things - draw the map designer, check validity of assignments in expression shapes, build intellisense; it would even check serialisation and de-serialisation attributes on classes vs. your schema when you try to assign a .NET class to a message in an expression shape. But as far as I'm aware, the schemas are rarely used by the runtime.
At runtime, when a message is received into an orchestration (and set to a pre-defined message type), its contents are not checked against the schema; neither does it get validated at the end of a transform or message assignment shape.
When you run a map you select a schema, but again - that map could well return something completely different; BizTalk couldn't care less.
When do I know schemas get used? In the pipelines. Sometimes.
If you're using the XmlDisassembler for example it would try to resolve the message type based on the message's root node and namespace, and then try to get the schema from the database.
The disassembler may then use this schema to promote some properties; if configured, it may debatch the message according to the schema and possibly use it to validate the message. All are very valid usages for the schema, but they are not always used, and they require specific configuration, either in the schema at design time or in the pipeline component (or both).
Also, at least with regards to property promotion, all that gets used is a bunch of xpaths provided in an annotation in the schema, not the actual schema information.
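The resolution step described above can be sketched in a few lines. This is only an illustration in Python of the idea (namespace plus root node gives the message type, which is then looked up); `DEPLOYED_SCHEMAS`, the example namespace and the file name are made up, and the lookup is a plain dictionary rather than anything resembling the management database.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry standing in for the deployed schemas.
DEPLOYED_SCHEMAS = {"http://example.org/orders#Order": "Order.xsd"}


def message_type(xml_text):
    """Build 'namespace#root' from the document's root element."""
    root = ET.fromstring(xml_text)
    tag = root.tag  # ElementTree spells a qualified name as '{namespace}local'
    if tag.startswith("{"):
        namespace, local_name = tag[1:].split("}", 1)
        return f"{namespace}#{local_name}"
    return tag  # no namespace: this sketch falls back to the bare root name


def resolve_schema(xml_text, deployed=DEPLOYED_SCHEMAS):
    """Return the schema registered for the message, or None if unknown."""
    return deployed.get(message_type(xml_text))
```

Note that nothing in this lookup needs the schema's content, only its identity, which is really the point I'm trying to make about 'registering' message types.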
There are, of course, other cases where schemas are required - FlatFileDisassembler, XmlValidation, Xml and FlatFile Assemblers all need schemas for their work (to some extent at least) and definitely the design time environment uses them extensively, but what I'm arguing is - can we do without having to deploy schemas if they are not used?
BizTalk works in a late-binding fashion anyway, where assemblies and their contents are loaded from the GAC/database as needed (and may be unloaded after a period of them not being used), couldn't we get away with only deploying the schema when it is needed at runtime, and simply 'register' message types when it is not?
In fact - even if a schema is needed at runtime - why does it need to exist in the database? How is it different from maps, pipelines, orchestrations, all of which are 'known' to the database but physically exist only in the GAC? (Well, that's not accurate - the orchestration's structure is stored as XML in the database, but that's to be displayed in HAT, and possibly a bad design decision in its own right.)
I can't help thinking I'm missing something; I'm sure the guys behind BizTalk gave this decision a lot of thought and found good justification for it, didn't they? Can anyone comment on what that might be?
One argument could be that BizTalk wants to know which messages are 'supported' by the solution - just as a message arriving with no subscription is considered an error, a message arriving which is not of a known 'type' should be considered an error. But in a sense the two are the same, and in any case BizTalk is quite happy to support 'blob' messages through the use of passthrough pipelines and XmlDocument as a message type in the orchestrations.
Here's another one from the archives (=the list of things I have waiting to be blogged)
At some point we had a sudden peak in system load on our BizTalk processes and, as a result, our BizTalk solution that was running so nicely seemed to have gotten "stuck".
By "stuck" I mean - we ended up with lots of processes in "Active" state, but they did not seem to be active at all; a closer inspection (of trace that should have been emitted) showed that although the instances' status said "Active" they were all very passive indeed - nothing was executing on the server - close to 0% CPU and no trace whatsoever.
This is where you might expect me to describe the long hours we've spent investigating the issue, the sleepless nights and empty cartons of pizza... but really what happened is that, not being able to afford any more down time, we called out Premier Support. That turned out to be a great thing, because the first thing they did (well, not literally, but anyway) was to ask us to check the state of the server using the MsgBoxViewer, which in turn pointed out that we had simply "max-ed out" our memory consumption throttling level.
You see - we use a lot of caching of data in our processes, mostly because we frequently access a lot of reference data - data that does not change very often; this is by design. What we forgot to do is estimate the amount of memory this caching will require when many different clients use the system, and adjust the throttling level accordingly.
As you can see from the image below - out of the box the BizTalk hosts are configured to throttle at 25% of the server's physical memory. The idea is to prevent the BizTalk processes from taking up too much memory and killing the server, and the assumption is that if throttling kicks in, and stops processing instances, memory consumption will slowly reduce until the server gets back to a more healthy state. However - by its very nature - caching does not really release memory that often, so instances stopped progressing but no memory was released as a result, and we got "stuck".
In our case, the solution was straight forward - as we know our memory consumption will be high, and we know there's nothing else running on the server to compete with that memory consumption (more or less) we could increase the threshold to 50%, which is enough to grant BizTalk Server enough memory for the caching and all the processing requirements.
In the process we monitored the situation by investigating two BizTalk performance counters - "Process memory usage threshold" (here shows as 500MB) compared to "process memory usage" (here showing around 130MB).
As long as there was large enough gap between the two we knew our processes are going to be just fine; it is always important, of course, to monitor these over time to ensure there's no memory leak in the processes, which we have done, on top of peak load tests - which we have not.
Now, while all of this is down to a test or two we may have neglected on our side, there are a couple of interesting points at the back of this from a product perspective -
- We were confused by what we saw mostly because of the "active" state of all instances (and we had quite a few); we would have diagnosed the problem much quicker, and on our own, had the admin console indicated that the server is not actually processing anything due to its throttling state.
- I can't help wondering whether the throttling mechanism couldn't be a bit more clever and identify that it has reached a dead end and is not actually helping to improve the situation. In our case the engine realised memory usage had gone too high and stopped processing instances. Wouldn't it be great if after, say, 10 minutes it realised that memory is not actually reducing, and so it will never exit the throttling state, and wrote something to the event log?
Again - not trying to make any excuses, just thoughts with the power of hindsight...
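The "dead end" check in the second point does not need to be clever. A rough sketch in Python, assuming someone samples the "process memory usage" counter once a minute and knows the threshold (the function name, the one-minute sampling and the 10-sample window are my assumptions, not product behaviour):

```python
def throttling_is_stuck(samples_mb, threshold_mb, window=10):
    """Decide whether memory throttling looks like a dead end.

    samples_mb: memory readings in MB, one per minute, oldest first.
    Returns True when every reading in the last `window` samples is at or
    above the threshold and usage has not dropped over that window.
    """
    recent = samples_mb[-window:]
    if len(recent) < window:
        return False  # not enough history to judge yet
    always_above = all(s >= threshold_mb for s in recent)
    not_falling = recent[-1] >= recent[0]
    return always_above and not_falling
```

A monitoring script could call this on each new sample and raise an alert (or write to the event log) the first time it returns True.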
Microsoft have just publicly announced BizTalk Server 2006 R3 on Steve Martin's blog here.
Not much more to say on top of that I guess...that's the whole story.
With BizTalk server every DLL we use has to be in the GAC.
Much too often, after making a change to such a DLL, I forget to GAC it in time before running a test which would, naturally, result in the test failing; in most cases the error is obvious and I simply have to return to the infamous build-gac-restart host cycle before running my test, but every now and then I get thrown by the error and do not realise it is a simple case of me forgetting to GAC an assembly.
To avoid those annoying moments I have developed a habit of adding the command to GAC an assembly to the project's post-build event.
The command required looks like this -
"$(DevEnvDir)..\..\SDK\v2.0\Bin\GacUtil.exe" /i "$(TargetPath)" /f
and it can be used to make sure that on a successful build the assembly generated will be added to the GAC.
Of course this has downsides - it is quite possible that you do not want to GAC on every build; furthermore - the post-build event is part of the project properties and as such goes into whatever source control you're using - now you have to worry about all those other developers who might get that code and build it - do they want it in the GAC?
Having said all that I find that - for me - having the post build event is usually better than not having it.
Ewan Fairweather, a good friend, had pointed out that my RSS feed only published the first paragraph of any posting and that it puts him off reading my blog (as if any reason is required...).
Anyway - I completely forgot I left it in this state last time I played around with the blog's settings, and so I apologise to anyone to whom this has caused any inconvenience; I believe this has now changed and so all the posts' content is available directly from the RSS feed.
The list of shapes in the BizTalk toolbox is not a very long one and so - 4 years after BizTalk 2004 was released I find it strange to discuss the behaviour of a simple shape like the terminate shape; the thing is - that I never quite paid attention to how useful it can be, and so I figured others may have missed it as well.
Now - it is not like I'm talking about anything revolutionary here, just a small oversight on my part.
In essence the terminate shape is as straight forward as anything can be really - you put it in your workflow and behold - your process, upon reaching this shape, would terminate!
The shape takes one "parameter" - a string (or anything that returns a string) which would be the "reason" for termination.
I've known for a while that this string would appear in the HAT query results list as the reason for the termination, and so it is quite useful from operational perspective (why did that process stop again?)
However, I always claimed that terminate is really intended for error scenarios, and not for situations where the process has simply reached a point where it should stop processing. I have even, in a previous post, suggested introducing a new "end" shape. In that post I mentioned a possible alternative - using the Terminate shape - which was also suggested in a comment by an anonymous reader, but as I believed that the shape is really meant for error scenarios I felt this was not ideal; at the time, though, this was more a gut feeling than anything else.
Today I noticed for the first time the label used for the termination "reason" - it is called "error info", which to me is the "proof" I needed (at least to satisfy myself) that the terminate shape was indeed intended to be used for error scenarios. The same error info appears when you view the message flow in HAT of a process that has been terminated: in the top section, with all the general details about the process, you will find an 'error info' label; any text provided for the terminate shape will be shown there.
I've spotted this only because, being the hard-headed guy that I am, I have a few orchestrations that have a terminate shape as the last shape in the process. "What is the point in that??" I hear you ask... well - this is why I looked at my decision again.
Well - there are a few possible scenarios; here's one: we have a process that is exposed as a web service. The process initiates several sub-processes (using a mixture of call and start orchestration) and then returns to the caller with a response.
If we have errors in the process we keep track of them in a helper object and return them as a soap header (with or without a response) so that the client is aware of them.
Using the terminate shape at the bottom of the orchestration (if my errors collection is not empty) I can report the errors to HAT and make them visible to our operators as well; instead of the orchestration just showing as completed, they get a hint through the status that something did not go smoothly, and by inspecting the error info field they can find out what it was.
Yes - I know I can use the event log - but this way it gets logged with the process in HAT, where we can easily find it.
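The helper object itself is nothing special. Here is a rough Python sketch of the idea, just to show the shape of it (our real helper is a .NET class; `ErrorCollector`, its method names and the `max_length` cut-off are invented for this illustration and do not reflect any BizTalk limit):

```python
class ErrorCollector:
    """Gathers errors during a process and renders them as one string."""

    def __init__(self):
        self._errors = []

    def add(self, source, message):
        # Keep the failing step next to the message so operators get context.
        self._errors.append(f"{source}: {message}")

    def has_errors(self):
        return bool(self._errors)

    def as_error_info(self, max_length=256):
        """One line for the terminate shape; max_length is an arbitrary cap."""
        text = "; ".join(self._errors)
        if len(text) <= max_length:
            return text
        return text[:max_length - 3] + "..."
```

At the end of the process the decision is simply: if `has_errors()` is true, terminate with `as_error_info()` as the reason; otherwise let the process complete normally.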
I'm pretty sure I'm not alone suggesting that the HAT tool is somewhat, let's say, lacking....
There's quite a few annoying things about the tool, but there's one thing in particular that has to be at the top of the list, because, in my view, it means that in the one case that you really need HAT to help you out, it fails you miserably.
The "orchestration debugger" is a nice selling point for BizTalk: you develop your process and, assuming you have the relevant tracking settings turned on you can go back to processes already completed and "replay" them to see which shapes have been executed and which haven't.
This is really great when viewing processes already completed, and also somewhat useful when setting a breakpoint in the process and attaching to the process in HAT (although not as useful as one might think).
However - it is completely useless when dealing with suspended orchestrations.
If your orchestration gets suspended for whatever reason, you get a nice error message in the event log; in all likelihood the event log message will even contain the name of the shape in your orchestration in which the exception occurred. However - find the suspended instance in the admin console or in HAT, open the orchestration debugger - and you're in for a surprise: the viewer will only show you execution up to a few shapes BEFORE the actual shape that failed.
I'm not sure I have the story right, but I believe this "bug" (it's really a "side effect", more on this in a second) was introduced in 2006 (but I no longer have 2004 installed to prove it was actually better before hand).
As far as I know, one of the changes in BizTalk 2006 is around the way orchestrations handle exceptions - in BizTalk 2004 all unhandled exceptions in an orchestration would (if my memory serves me right) result in a suspended non-resumable instance; in 2006 these instances are resumable. This suggests that BizTalk 2006 has to keep the state of the orchestration BEFORE the error occurred - so that if an administrator chooses to resume the process (possibly after fixing whatever caused the suspension) the process could start again and retry the action where it got suspended before.
In order to achieve this BizTalk probably keeps the last GOOD state of the orchestration in the database (from the last persistence point executed); in other words - where before a suspension would cause a persistence point, from 2006 it does not and the orchestration's last persistence point is what's kept in the message box.
If that is correct it would explain why the orchestration debugger only shows information up to a point before the shape that caused the suspension - it would only have information up to the last persistence point.
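The reasoning above can be sketched as a toy model. This is only an illustration of the hypothesis in the text, not of how the BizTalk engine is actually implemented; the class and member names (ToyOrchestration, Run, VisibleToDebugger) are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: it mirrors the hypothesis in the text, not BizTalk's real engine.
public sealed class ToyOrchestration
{
    // Shapes executed in memory so far.
    private readonly List<string> executed = new List<string>();

    // Number of executed shapes covered by the last persistence point.
    private int persistedCount;

    // Runs a shape; 'persistAfter' marks a persistence point once the shape succeeds.
    public void Run(string shape, bool persistAfter, bool fails = false)
    {
        if (fails)
        {
            // Suspension: progress made since the last persistence point is discarded.
            executed.RemoveRange(persistedCount, executed.Count - persistedCount);
            throw new InvalidOperationException("Suspended at shape: " + shape);
        }

        executed.Add(shape);
        if (persistAfter)
        {
            persistedCount = executed.Count;
        }
    }

    // What a debugger reading only the stored state would be able to show.
    public IReadOnlyList<string> VisibleToDebugger
    {
        get { return executed.GetRange(0, persistedCount); }
    }
}
```

If a receive shape is followed by a persistence point and the failure happens three shapes later, the model only "remembers" the receive shape, which matches the behaviour described above.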
I don't know if this was an oversight when releasing 2006 or a conscious sacrifice, but I think it's a big pain point; it would have been great to see it all: where the exception occurred, the state of the orchestration at that point, and an indication of where the last persistence point was, so we could tell what will get executed when we resume the orchestration.
There are quite a few good uses for HAT: it's a great tool for knowing what's been executed on the server over time, it's not a bad tool for looking at how long a service took to run, and it's even somewhat useful for checking the flow of a particular message through the engine using the message flow or the orchestration debugger view. But when it comes to finding the cause of a suspended orchestration, it is quite pointless for that reason.
So if you counted on it bailing you out when your process fails - you may as well switch off orchestration shape tracking for your processes.
It is generally known, I believe, that BizTalk has somewhat of a steep learning curve.
BizTalk server is by no means a simple product, but that's ok because we tend to do complex stuff with it :-)
Over the years, from version to version, Microsoft has invested a lot in making the server easier to learn and use. Some improvements were made in the UI, but most of the investment went into the documentation, tutorials, examples and "community content"; I do believe this has made BizTalk much more accessible.
There are still quite a few things in BizTalk which, from my experience, tend to confuse new starters; one of them is the concept of orchestration ports and port types and the "New Configured Port Wizard".
I find that too many developers use BizTalk without fully understanding the fact that it is a strongly typed system and without understanding the relationship between ports and port types (and messages and [multipart-]message-types).
This is not helped by the fact that you can quite easily develop an orchestration without explicitly defining either.
The new configured port wizard defaults to creating a new port type whenever you configure a port; I suspect many developers never give it a second thought and simply create a type for each port they use; this way you can easily create quite a few copies of the same type and not even recognize it.
Furthermore, the wizard creates the port and the port type, but only completes the port type definition when you connect the port to a receive/send shape that has a message configured (the message-type portion). I believe this further confuses people as 1) it does not make it clear that the message type is indeed part of the port type definition (alongside the access modifier and the message exchange pattern) and 2) as the port type definition does get completed when you create the link, many developers do not understand why they cannot replace the message in the receive/send shape, or connect another receive/send shape to the recently created port (because the message types may not match).
If I had to guess, I would say the wizard works this way on the assumption that it lowers the entry barrier to developing BizTalk processes (as people can develop processes without understanding the concept of types), but in my view here lies the problem: developers produce code they do not fully understand, with all the problems that creates.
A couple of months ago we needed to deploy our existing solution to a new BizTalk group to serve a different client.
For various reasons we have decided to dedicate a BizTalk server for this client, so we don't have to worry about the impact of deploying changes to our existing clients etc, but to share the SQL server.
Obviously this is not the most ideal setup, but as the new client is in a completely different time zone to the existing ones, and both environments are not (yet) expected to have a high volume of traffic, we could be quite confident that the SQL server can support both environments (one during the day, the other at night).
Anyway - BizTalk is quite happy to support that; you can easily configure it to use different databases on the same DB server using the configuration wizard, and it all works quite well. Nothing to report.
That is - until we wanted to deploy our BAM activities on the server.
For the sake of this discussion let's assume that when we've configured the first BizTalk group we left the default 'BAMPrimaryImport' database name for the BAM main database, and that for the second group we use 'BAMPrimaryImport2'.
Deploying BAM activities using bm.exe is usually a no-brainer: the tool takes care of all the hassle of creating tables, relationships, views, indexes etc., as well as registering them all in the BAM repository.
However, the tool also generates an SSIS job for each activity for purge-and-archive purposes; these jobs are simply named after the activity they support and are then deployed to the SSIS server (which, as far as I know, is not partitioned by an entity such as a database), and this is where we faced a problem:
As the first group already had our BAM Activities deployed, the corresponding SSIS jobs were already deployed to the server with the generated names; when we came to deploy the activities for the second group, bm.exe went on to try and generate the SSIS packages using exactly the same (generated) names and failed saying such packages already exist.
As far as I know there is no way to control the names of these packages bm.exe would use and so we were a bit stuck.
Fortunately, renaming the existing jobs was fairly easy in SQL management studio and, as they are not referred to by anything (other than a schedule to run in SQL), fairly safe. So what we did was rename the packages created by the first group, so that bm.exe could create the packages required for the second group under the old names without failing.
An orchestration deployed into BizTalk server can be in one of four states -
- Unenlisted (unbound) – the process has been deployed to the server, but is unconfigured (host and/or port bindings are not set), unsubscribed and is not running.
- Unenlisted (bound) - the process is configured, but is still unsubscribed and is not running.
- Enlisted (stopped) – the process is fully configured, subscriptions have been created, but it is stopped.
- Started – process is ready to run (and will do so when activated by a message)
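As a rough summary of the list above, the four states can be written down as an enum with a few helper checks. The type and member names are illustrative only and are not BizTalk API names.

```csharp
// The four states listed above; the member names are illustrative, not BizTalk API names.
public enum OrchestrationState
{
    UnenlistedUnbound,
    UnenlistedBound,
    EnlistedStopped,
    Started
}

public static class OrchestrationStateInfo
{
    // Host and port bindings are set in every state except the first.
    public static bool IsBound(OrchestrationState state)
    {
        return state != OrchestrationState.UnenlistedUnbound;
    }

    // Subscriptions exist only once the orchestration has been enlisted.
    public static bool HasSubscriptions(OrchestrationState state)
    {
        return state == OrchestrationState.EnlistedStopped
            || state == OrchestrationState.Started;
    }

    // Only a started orchestration will be activated by an incoming message.
    public static bool CanBeActivated(OrchestrationState state)
    {
        return state == OrchestrationState.Started;
    }
}
```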
One example I’ve seen goes something like this –
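// Forward-only reader over the source document
XmlTextReader xtr = new XmlTextReader("books.xml");
// The xpath expressions to watch for; Add returns a handle for the query
XPathCollection xc = new XPathCollection();
int onloanQuery = xc.Add("/books/book[@on-loan]");
XPathReader xpr = new XPathReader(xtr, xc);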
Then, in order to get the value the stream needs to be read -
// Advance through the stream, stopping at each node that matches a query in the collection
while (xpr.ReadUntilMatch())
{
    Console.Write("{0} was loaned ", xpr.GetAttribute("on-loan"));
}
You can, of course, replace the Console.Write with any action required on the data located; you should also check exactly which xpath was hit (if you have more than one in the collection).
The problem with this is, as most of you may well know, that it requires the component to read the entire stream during its execution.
This actually has two disadvantages. One is performance: assuming this is a receive pipeline, BizTalk will have to read the stream anyway to write the message to the message box (in a send port the send adapter will do the same). If we could avoid reading the stream ourselves, and simply raise events on the stream as BizTalk’s internals read it, we would significantly improve the performance of our pipeline.
Secondly – since we’ve read the stream, we may have a problem now when we go back to return the stream to the pipeline; some streams we might receive are not seekable (such as anything coming from the HTTP or SOAP adapters) and so we can’t simply rewind them and we surely can’t return a stream pointing at the end of the message to the pipeline. It is enough to read Charles Young’s great series of articles about receive pipelines in BizTalk 2004 (http://geekswithblogs.net/cyoung/articles/12132.aspx) to see what sort of issue you might face.
Although some of these issues have since been addressed, the underlying problem remains: by reading the stream we have to then return a “touched” stream to the pipeline, which may or may not cause issues, and as we can’t always be sure in what context our component will be used (can you assume send vs. receive? Can you assume a particular adapter will be used? Can you assume port maps will not be used?) we should look for a better way to do this. Luckily such a way exists that helps in most circumstances – BizTalk’s XPathMutatorStream.
Luckily for me, Martijn Hoogendoorn already wrote about it a couple of years ago in his blog; I just thought I’d point this out.
Obviously this stream was designed to allow replacement of values, but nothing prevents you from setting the output parameter to the input parameter and thus avoiding any changes.
As you can see in the post, using this stream means you never actually need to read the message’s stream yourself; you simply wrap the original stream with an instance of the xpath mutator stream, add the xpaths you want to the collection and return the wrapped stream with the message. When BizTalk reads the message’s stream it will now be reading your stream, which raises the appropriate event when your xpaths are hit. Marvellous!
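To illustrate the wrapping idea on its own, here is a minimal sketch of a pass-through stream that raises an event as a downstream consumer reads it. It is not the XPathMutatorStream and does no xpath matching; ObservingStream and its DataRead event are hypothetical names used only for this example.

```csharp
using System;
using System.IO;

// Simplified stand-in for the wrapping idea: it does no xpath matching, it only
// raises an event for each chunk as the downstream consumer reads the stream.
public sealed class ObservingStream : Stream
{
    private readonly Stream inner;

    // Raised with (buffer, offset, bytesRead) every time the consumer reads data.
    public event Action<byte[], int, int> DataRead;

    public ObservingStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = inner.Read(buffer, offset, count);
        Action<byte[], int, int> handler = DataRead;
        if (read > 0 && handler != null)
        {
            handler(buffer, offset, read);
        }
        return read;
    }

    // Forward-only and read-only, so it also works over non-seekable sources.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}
```

The component never reads the stream itself; it subscribes to the event, hands the wrapper on, and the handler runs only when whoever owns the stream next actually reads it.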
From a performance point of view, I have not looked at the implementation of the stream so I can’t say for sure that it is faster than reading the stream completely in your pipeline, although my gut feeling says it is. From a robustness perspective, though, this solution is definitely much better, as it eliminates all the problems one might encounter by reading a message’s stream in the pipeline.
When is this solution not good? When you need an entire XmlNode. This stream can return a single value (element or attribute, I believe), but if you need the entire contents of an XmlNode (with child elements or various attributes) it would not serve you well.
Initially one might think this does not make sense, but I find myself doing it quite often actually; two typical scenarios are when one has to create a message before branching the process to satisfy the compiler (and avoid the “use of unconstructed message” error), a second is when one needs to return a message that simply contains a few values obtained from, say, a database call, or some calculation etc.
In this case often it is simpler to create the shell of the message with the elements/attributes required and then use xlang’s xpath function to push the values into the right places.
Whatever the scenario is, there are, I believe, two options to create empty messages (corresponding to the two options to create messages in orchestrations in general, actually) –
1. Using a map
2. Using a message assignment shape (and a helper method).
The first one is quite obvious – you create a map, pick any message you may have in the process (you are likely to have at least one message in your process already) - use that as your input message and the message you want to create as your output message; then you’re facing two alternatives – the first one is to use xsl – you can create an xsl file that effectively has the XML you want to create hard coded in it and instruct the map to use that xsl.
The input message is completely ignored (and so is completely irrelevant) and the output message always looks the way you want it to. You are not sensitive to any changes in the schema of your input message.
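The xsl alternative can be tried outside BizTalk with a small sketch like the one below. It runs a stylesheet with a hard-coded result against any input using XslCompiledTransform; the output message (OrderStatus, its child elements and the example.org namespace) is a made-up placeholder, and in a real map the xsl would live in its own file referenced by the map.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class EmptyMessageXsl
{
    // Hypothetical output message; the element names and namespace are placeholders.
    private const string Stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:template match='/'>" +
        "<ns0:OrderStatus xmlns:ns0='http://example.org/schemas/status'>" +
        "<OrderId/><Status>Unknown</Status>" +
        "</ns0:OrderStatus>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Applies the stylesheet to any input document; the input content is never consulted.
    public static XmlDocument CreateFrom(string anyInputXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsl);
        }

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(anyInputXml)))
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }

        XmlDocument result = new XmlDocument();
        result.LoadXml(output.ToString());
        return result;
    }
}
```

Whatever document is passed in, the result is the same shell message, which is exactly why changes to the input schema do not matter here.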
The other alternative, which I like less, is to actually use the mapper; in this alternative you would probably map the root node of your input message to the root node of your output message and then set up default values in the output message for any element/attribute that you wish to include in your output message.
The reason I believe this is not as good as the first alternative is that it is less obvious what the output message would look like; one has to follow the nodes to see what is set (or test the map) to see the output, while in the xsl alternative one look at the xsl is usually enough to show what the output is going to look like.
Whichever alternative you choose, I used to think that using a map is a better option than using the message assignment shape (which I will describe shortly), mostly because I thought it is a more standard way to create messages, and so it is more obvious, looking at the process and the project, where such constructions take place. I also believed that if you’re using xsl files to create the output, it would be very easy to spot, read and change them in the solution when necessary (simply find the xsl files and change them, no need to look for anything else).
Thanks to several people, but mostly to Ben Gimblett, with whom I work on one of the projects I'm involved in and who insisted on not following my advice and on using helper methods to create messages, I now agree that using helper classes is the better way. This is mostly because with them you don’t have to use a dummy input message (as you do in the map), which, I have to agree, can get quite confusing to anyone trying to understand the process, but also because, I suspect (but have not tried to prove), a helper class will perform better than the mapper option.
When using a helper class you again have two alternatives –
You could have a message assignment shape in which you call a method that returns a .net type (class) that has been generated (using xsd.exe) from your schema.
All you need to do in your method now is create an instance of that generated return type – populate whatever members you require and return it.
In the assignment shape you assign the return value from the method to your message[part]; BizTalk takes care of the serialization and converts the class to the schema, and because the class and the schema both represent exactly the same thing the serialisation works just fine.
The benefit of this approach over the mapper option is mostly that there's no need for any dummy input message, no need to write xml or xsl; only very simple (and quite minimalist) code is required.
The downside – you need to generate those .net classes to represent any schema you wish to return, and maintain them as your schemas evolve.
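A minimal sketch of this first alternative might look as follows. The OrderStatus class is a hand-written stand-in for what xsd.exe would generate, and its name, members and namespace are hypothetical; ToXml is included only to inspect the result outside BizTalk, since in the orchestration the engine does the serialization.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written stand-in for a class that xsd.exe would generate from the schema.
// The type name, members and namespace are hypothetical.
[Serializable]
[XmlRoot(Namespace = "http://example.org/schemas/status", IsNullable = false)]
public class OrderStatus
{
    public string OrderId;
    public string Status;
}

public static class MessageFactory
{
    // Intended to be called from a message assignment shape; only the members
    // the process needs are populated.
    public static OrderStatus CreateOrderStatus(string orderId)
    {
        return new OrderStatus { OrderId = orderId, Status = "Unknown" };
    }

    // Only for inspecting the XML outside BizTalk; not needed in the orchestration.
    public static string ToXml(OrderStatus message)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OrderStatus));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, message);
            return writer.ToString();
        }
    }
}
```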
The second alternative is simpler on the one hand as it does not involve generating and maintaining classes; it does, however, require a bit more wiring –
It starts the same way – a message assignment calls a method whose return value is assigned to the constructed message[part].
The difference is that the method does not return a strong type; instead it returns an XmlDocument whose contents are loaded from a compiled resource within the assembly.
The function takes in the name of the xml that needs to be used to create the constructed message, retrieves the resource from the resources in the assembly, loads it into an xml document and returns it to the caller.
I find that this last approach works best for me in most cases – all the generated xmls are in one place which makes them easy to maintain (which I liked in the xsl option), there’s very little co-ordination that’s required – only the name of the xml file (or any other key one wishes to use) must be known to the caller and the xml resource should match the schema – but as it is stored in one location AS XML this is very easy to achieve and maintain.
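A helper along these lines could be sketched as below, assuming the xml templates are compiled into the helper assembly as embedded resources. MessageTemplates and the resource naming are hypothetical; GetManifestResourceStream returns null when no resource has the given name, so that case is turned into an explicit error.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;

public static class MessageTemplates
{
    // Loads an xml template embedded in this assembly and returns it for assignment
    // to the constructed message. The resource name is whatever key the caller and
    // the assembly agree on, e.g. "MyProject.Templates.OrderStatus.xml" (hypothetical).
    public static XmlDocument Create(string resourceName)
    {
        Assembly assembly = typeof(MessageTemplates).Assembly;
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                throw new ArgumentException(
                    "No embedded resource named '" + resourceName + "'.", "resourceName");
            }
            return Load(stream);
        }
    }

    // Separated out so the parsing can be exercised without an embedded resource.
    public static XmlDocument Load(Stream xml)
    {
        XmlDocument document = new XmlDocument();
        document.Load(xml);
        return document;
    }
}
```

In the message assignment shape the call would then be a one-liner assigning the result of Create to the message being constructed.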
In .net code a developer could simply put a “return” keyword in the function to stop the execution and return to the caller; unfortunately there is no equivalent in orchestration development.
The only “clean” way to end an orchestration currently is to reach the built-in, fixed, red shape at the bottom of the orchestration designer.
This often means that decision shapes exist all over the orchestration with branches that have to drop all the way to the bottom. In large orchestrations this can be hard to follow up on.
It could have been made much nicer by allowing a developer to add as many “end” (or “return”) shapes as needed in various places in the process. Decisions, of course, will still exist, but their scope will be much smaller.
An alternative that exists currently is to use the terminate shape, but I find it to be a bit artificial – my view is that termination of processes may be relevant in exceptional scenarios, as part of error handling, etc. but not when the orchestration ends correctly (albeit before reaching the end of the schedule). The main problem with the terminate shape is possibly that the orchestration gets a “terminated” state and not “completed”, which will make it difficult to report on in HAT (and otherwise).
Microsoft have released a press statement today announcing that they are implementing four principles across their high-volume products. These principles are -
(1) ensuring open connections;
(2) promoting data portability;
(3) enhancing support for industry standards;
(4) fostering more open engagement with customers and the industry, including open source communities.
So - what does that mean?
Primarily it would mean Microsoft would publish a lot more information about APIs to their products and protocols used within and between them, allowing developers to do even more with the products and integrate their own products/system more closely with Microsoft products.
It also means, or at least that is my understanding, that Microsoft would now officially allow developers to use these patented protocols for non-commercial use or even for development of commercial code; distribution of commercial code that uses those protocols does require a license, of course.
It seems that Microsoft, as part of this newly self-enforced commitment, are about to release a lot more information about their products, including, for example, how they are implementing various industry standards; when they have a variation or extension of a standard, that would be made clear and public.
Some products, like Office for example, which already moved to a more open format of documents in its latest version, will be opened further by providing a new set of APIs that would allow third parties to control a lot more of the application's behaviour including, apparently, supporting other formats, which would ultimately mean you could use Word, for example, to author documents in any format.
This is very exciting and it shows that Microsoft is listening to the public and is more than willing to engage better with the community (and it only took a few years of being chased down by the European courts :-)), but seriously - I have to admit that from my personal perspective Microsoft has been providing more than enough information for me to be able to do my work and advance my knowledge and expertise; through the MSDN web site, conferences, various partner programs, MVP etc. I had more information than I could swallow.
I'm sure many do not share that feeling (and then there are those who just like to hate Microsoft), and I hope this very exciting move will help change a little bit of that mood.
At a conference I attended recently someone asked one of the Microsoft guys if they ever plan to slow down a bit with the amount of new stuff they release (products, frameworks, SDKs, documentation, standards), which leads to the almost infinite time one needs to spend learning all of it; the answer, as you can imagine, was categorically no. But while I don't think anyone in the room expected it to be any different, I also don't think anyone anticipated such a big move on that front.
However, as stubborn (and purist?) as I may be I have to admit that some cases just call for the odd manipulation (or creation) of messages as classes in .net helper functions.
And so - it is very fortunate that BizTalk is intelligent enough to allow us to cast one to the other without any fuss.
To those of you who do not know yet it is quite possible to have the following -
A serializable class in any .net language.
A schema that correctly represents the xml representation of that class.
A class (ideally a different one) with a function (usually static, but doesn't have to be) that accepts a parameter of the class above.
An orchestration that has a message of the schema above.
An expression shape that calls the method in that second class passing the message.
BizTalk will take care of the serialization of the message and will pass an instance of the class to the helper method.
Similarly, it is possible to return a class from a function and use that function in an assign shape to create a new message (and, of course, the two approaches can be combined as well).
Anyway, we've known that for a while now, and it has worked wonders for us; this technique makes both the functions and the orchestrations that need to use them so much simpler; but recently I nearly went mad when I tried to do just that and kept getting casting errors at build time for - as I thought at the time - no reason at all.
For some reason BizTalk was unwilling to accept that my schema and my class are one.
Only after quite a few nerve-racking attempts did I find out what I did wrong -
All of our classes have serialization attributes to control the xml generated; mostly we're making sure we use attributes whenever possible (as opposed to the default element serialization) and that the attribute/element names are short-ish.
In the particular class I used we had an XmlRootAttribute that set a (different, shorter) name to the root element representing the class; as soon as I removed the name from the attribute (it is optional) VS was happy to compile my orchestration.
It appears that BizTalk needs the root element of the class to have the name of the class; no shortcuts there.
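The difference between the two class shapes can be seen with the plain XmlSerializer, outside BizTalk. In this sketch the class names are invented for illustration; it only shows how naming the root in XmlRoot changes the root element the serializer emits, which is the mismatch described above.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Root element keeps the class name: the shape that compiled in the orchestration.
public class CustomerOrder
{
    [XmlAttribute("id")]
    public string Id;
}

// Root element renamed via XmlRoot: the shape that produced the cast error.
[XmlRoot("ord")]
public class ShortNamedOrder
{
    [XmlAttribute("id")]
    public string Id;
}

public static class RootNameDemo
{
    // Returns the name of the root element the serializer emits for 'value'.
    public static string RootElementName(object value)
    {
        XmlSerializer serializer = new XmlSerializer(value.GetType());
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            XmlDocument document = new XmlDocument();
            document.LoadXml(writer.ToString());
            return document.DocumentElement.LocalName;
        }
    }
}
```

Shortening attribute and child element names appears to be fine; it is only the root element name that has to stay in step with the class name.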
By the way - a few months ago I published an article suggesting that passing XlangParts as parameters to helper functions may cause a memory leak.
As this would have prevented us from using this wonderful casting technique, we asked Microsoft to kindly confirm whether my understanding at the time was indeed correct and that one should not pass an XlangPart as a parameter to a function.
As I understand it now, and although it is not very clear from the KB article I've referred to in my post, passing XlangParts is ok as long as the lifetime of the message/part is not shorter than the lifetime of the parameter - in other words - as long as you don't keep the message or message part in memory in your helper class you should be safe.
But anyway - I was working the other day making some changes to one of our classes and, as most of our classes have both code and schema representation, I went on to generate the schema from the modified class.
So - I built the project with the class to generate an assembly, and in the Visual Studio command line I swiftly 'CD'-ed to the bin folder of the project and used XSD.exe to generate the schema, in order to replace my existing one and reflect the recent changes.
However, to my great surprise, the generated schema did not have any of the added nodes I expected to see - it looked suspiciously identical to the old schema I had.
It took me quite a few attempts before I realised what was happening - as you can imagine from the title of this post, the answer lies in the GAC. I had the older version of the assembly containing the class in the GAC, and so XSD.exe, although I pointed it at a specific assembly in the bin folder, decided to pick the old one from the GAC and use it to reflect the class.
As soon as I removed the old version from the GAC XSD.exe was happy enough to generate the correct schema for me.
In the BizTalk Administration Console you can run a query to view subscriptions (which is a huge step forward from the old subscription viewer for BizTalk 2004, I have to admit)
You can, however, only filter based on the subscription type (activation or correlation), the service name or the service ID.
What that means is that while you can find all the subscriptions that start a particular orchestration, you can't find all the orchestrations that would be started by a message arriving through a particular receive port, which would have been quite useful, don't you think?
What's even worse though, is that even if you were happy to do a lot of manual work, there doesn't seem to be any practical way of finding out this information using the admin console, although it is all there in the database -
When you bind an orchestration receive shape to a receive port, the subscription is created for you using the receive port's id rather than the port's name, so it would look something like -
http://schemas.microsoft.com/BizTalk/2003/system-properties.ReceivePortID == {E1E7FE08-D421-4D5A-8CD8-CA51E25FA508}
This is not very readable, but, unfortunately there is a much greater problem with it -
There doesn't seem to be a way to know a receive port's id through the admin console; so even if you did have the patience required to go through all your receive ports, matching their GUIDs to the one in the subscription hoping to figure out this way which ones will get your orchestration activated, you would find it impossible to find out which receive port actually has the id -'{E1E7FE08-D421-4D5A-8CD8-CA51E25FA508}'
The only way to find this out is to go to the management database and look it up yourself in the bts_receiveport table, bearing in mind, of course, that this id will change on the next deploy.
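The manual lookup can be sketched roughly as follows: pull the GUID out of the subscription predicate, then use it as a parameter in a query against the management database. The helper name is hypothetical, and the column names in the query text (nvcName, uidGUID) are assumptions that should be checked against your own bts_receiveport table before use.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubscriptionHelper
{
    // Query to run against the management database. The column names
    // (nvcName, uidGUID) are assumptions; verify them in your own database.
    public const string LookupSql =
        "SELECT nvcName FROM bts_receiveport WHERE uidGUID = @portId";

    // Pulls the GUID out of a predicate such as
    // "...system-properties.ReceivePortID == {E1E7FE08-D421-4D5A-8CD8-CA51E25FA508}".
    public static bool TryGetReceivePortId(string predicate, out Guid portId)
    {
        portId = Guid.Empty;
        Match match = Regex.Match(
            predicate ?? string.Empty,
            @"ReceivePortID\s*==\s*(\{[0-9A-Fa-f-]{36}\})");
        return match.Success && Guid.TryParse(match.Groups[1].Value, out portId);
    }
}
```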
I guess there's some learning curve around it, and that most first implementations of any BizTalk developer do not make much use of it (as they often start with all ports being directly bound) and that it takes a while before a BizTalk developer and the organization involved establish a good architecture and move more towards a true publish-and-subscribe, loosely-coupled approach to implementation.
However, the existing model is not perfect; in my view (and I suspect it is shared by many) it has two main weak points -
The first point is quite an obvious one - there will be latency associated with any implementation of a publish/subscribe mechanism. In the BizTalk case it involves writing the message and its metadata (context) to the message box (a SQL database) and having a separate process locate newly published messages, figure out which subscribers need to receive a copy of the message and manage the activation/correlation of the message-process interaction (as well as keeping a list of references for housekeeping etc.).
Reading from and writing to the database, the polling interval of the subscription evaluation process, etc. all introduce latency, which, in certain scenarios, can be crucial.
If the fragments of information floating around regarding Oslo are to be believed, we might see an in-memory pub/sub mechanism in a future version of BizTalk (in addition to, not as a replacement for, the existing model, I suspect) which, while it will no doubt come at a price (persistence, and therefore scalability and durability to some extent), will no doubt make supporting low-latency scenarios much easier.
As for the second point -
At first look the pub/sub in BizTalk is very flexible; in all the BizTalk demonstrations I can remember from the past the presenter would create a receive port and a couple of send ports and edit the subscriptions of those ports in the administration console to show how easy it is to create content-based routing in BizTalk server and configure it at runtime.
In BizTalk 2006 you did not even have to restart the host to speed things up (as you did in demos with 2004); it happens pretty much instantly.
However, the case with orchestrations is not that simple...
The subscription for orchestrations is specified as a filter in the properties of the initializing receive shape in the process; this gets compiled into your assembly together with the process, and will be used to create the subscription when you deploy the orchestration.
As far as I know, short of manipulating the management database yourself (which would not be supported) there's no way to change those subscriptions at runtime.
If you want to change the subscription you have to change the filter in the orchestration, build, undeploy the old version and deploy the new one (or version the process and perform a side-by-side deployment).
This is, in my view, an unnecessary pain in dynamic organizations (aren't they all?) that require changes often, and to that end developers have had to find a solution to the "I need to be able to change that subscription from outside the process" requirement.
That solution is often adding some routing metadata to messages in the form of context properties ('nextProcess', 'Operation', etc.) which would be set by publishing processes and/or pipeline components and use these in the filters (rather than the actual content data).
So you will often see a pipeline component, often driven by some external configuration, that checks for certain bits in the message or its metadata and sets these context properties based on the values it finds. The premise is that pipeline components are easier to replace, but also that these components often use a database or a rules engine in one form or another to decide what goes into the message context, and by doing so introduce the real flexibility that is advertised.
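Stripped of the BizTalk specifics, the idea looks something like the sketch below. A plain dictionary stands in for the message context, the rules stand in for whatever database or rules engine drives the component, and all names (RoutingRule, RoutingStamper, the "nextProcess" key) are illustrative rather than real pipeline APIs.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a dictionary stands in for the message context, and the
// property name "nextProcess" is just the example used in the text.
public sealed class RoutingRule
{
    public Func<IDictionary<string, string>, bool> Matches;
    public string NextProcess;
}

public static class RoutingStamper
{
    // Applies the first matching rule and writes the routing property that
    // orchestration filters would subscribe on. Returns false when nothing matched.
    public static bool Stamp(IDictionary<string, string> context, IEnumerable<RoutingRule> rules)
    {
        foreach (RoutingRule rule in rules)
        {
            if (rule.Matches(context))
            {
                context["nextProcess"] = rule.NextProcess;
                return true;
            }
        }
        return false;
    }
}
```

The orchestration filter then only ever tests the routing property, so changing where a message goes means changing the rules, not the compiled process.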
What all of this means is that we, developers, end up developing a pub/sub mechanism on top of the existing pub/sub simply because we need flexibility the product does not provide.
I don't like this approach, but I end up doing this myself occasionally, simply because I have to.
I could possibly understand why MS has decided to do so - there are benefits to editing the subscription expression within the orchestration (known types would be one thing), and also - one could argue that the process subscription is part of the process design and so changing it is likely to involve code changes as well which will require a re-build, but really - I think we would all have benefited from the ability to edit the orchestration subscription in the same way we can edit send port subscriptions - through the admin console.
Anyway, as you can imagine we're using web services quite extensively - we expose a lot of them, and consume even more; some are internal to the company (but cross teams, although not so much platforms) and many are external (which do cross platform as well)
The reasons to use service oriented architecture should be very clear to everyone by now, as are the famous four tenants of SOA.
In out implementation we've abstracted the calls to all the internal web services through utility orchestrations which would take a message in our canonical format , convert it to the service's format, call the web service and transform the response to the canonical format before returning it to the calling process; this way we can re-use those transformations, and have a central place to deal with each request, apply error handling, etc.
From the parent process we then use call orchestration to initiate these utility orchestrations passing in the request cannonical message and receiving the response canonical message as an out param, which is quite efficient (when initiating orchestrations through the call orchestration shape the request does not go through the message box)
As we're doing the transformations in these utility processes, we consider them to be in the boundary of our process, and not, obviously, within the boundary of the called service, for this reason we call the web service from the process rather than the actual assembly behind the web service.
When we call the web service from within the utility process, what actually happens is that the request message (now in the WS' format) gets published to the message box and picked up by the send port, which passes it to the SOAP adapter which, in turn, serialises it and transmits it over the wire to the service; the service then deserialises the message on the other end before executing whatever code needs to be executed, and the entire process repeats in the opposite direction.
In this case the service boundary is the web service endpoint.
A few weeks ago I had what I thought was a brilliant idea - why not treat the message box as the service boundary!?
If I had a process that takes in the service's format of the message using a directly bound receive shape and a filter, executes the code internally (as we're now inside the service boundary we can use the service code directly from an expression shape, no need to go through a web service) and, when finished, publishes the response back to the message box (in its own format), I could simply publish a request message for that service and get the response published back for me; correlation would be needed, but this can be handled using self-correlating ports or a correlation set.
The client process would do pretty much the same - it would use a utility process to transform the canonical format to the service's format and publish the request. It would then use correlation to receive the response and transform it back to the canonical format before returning it to the calling process (synchronously).
What would we save? - following this approach for at least some of our internal services can save us the need to serialise the messages over the network; in the web service case we have to go through the message box from the process to the send port anyway, so going through the message box from one process to another would not make a difference, but all the network traffic and the work by the SOAP adapter (which is far from being efficient) can be saved.
This was a good idea (I thought anyway), but I suspect it won't work, as it has two main flaws (and I will be extremely happy to get some ideas around those) -
Firstly - both subsystems will need to exist on the same BizTalk group so that they share the same message box and so we could use pub/sub to exchange messages between them (on its own this is not necessarily a problem, but it is the main cause for the next one, which is the big one)
Secondly - the schemas will have to be shared -
When you're adding a web reference to a web service from a standard .net project a proxy gets generated for you; that proxy will include a local version of all the classes used by the web service (these will be in YOUR code namespace rather than the service's, but will serialize to the same XML).
Equally - when you add a web reference in a BizTalk project, you get schemas generated so you can create messages to send and receive to/from the web service; these will be in the service's XML namespace as they have to represent the XML supported by it, and here lies the problem.
If both the service implementation and the client implementation are on the same BizTalk group, the schemas will have to be shared as there's no way to deploy two schemas using the same root node and namespace and we all know that sharing schemas is a bad idea as it strongly couples the implementation together and that pretty much renders the idea useless (this, confusingly I suspect, means we're sharing a class and not a contract).
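The point about proxy classes living in your own namespace yet serialising to the same XML can be sketched roughly as below. The `ServiceSide` and `ClientProxy` namespaces, the `Order` class and the `urn:example:orders` XML namespace are all made up for illustration; a real generated proxy would carry more attributes than this.

```csharp
using System.IO;
using System.Xml.Serialization;

namespace ServiceSide
{
    // Hypothetical class as the service author would define it
    [XmlRoot("Order", Namespace = "urn:example:orders")]
    public class Order
    {
        public int Id { get; set; }
    }
}

namespace ClientProxy
{
    // Hypothetical local copy, as a generated proxy would hold it:
    // a different CLR type, but the same XML root node and namespace
    [XmlRoot("Order", Namespace = "urn:example:orders")]
    public class Order
    {
        public int Id { get; set; }
    }
}

public static class XmlShape
{
    // Serialises any XmlSerializer-compatible object to a string
    public static string Serialize<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Serialising a `ServiceSide.Order` and a `ClientProxy.Order` with the same `Id` gives identical XML, which is why two .net projects can each keep their own class while two BizTalk schemas with that root node and namespace would clash in one group.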
Of course one could play around with the idea of having two BizTalk groups and communicating between them, and although you can choose better transports than SOAP for that internal communication I suspect that brings us closer to simply calling the web service, and so I'd rather stay with that standard approach.
I've received a good question today -
"we had a little debate in the office today - what is faster - running a map with pure xsl or the standard way with functoids, what you think?"
As I've blogged before - I'm a big supporter of writing custom XSL and not using the Mapper and Functoids in anything other than the simplest of maps; so - although performance is only one of my arguments - the answer should be obvious.
Nevertheless I'll take the chance to answer properly again, although I suspect the question is not accurate enough -
At runtime there's no difference between the two; the Mapper generates XSL (which you can see by "validating" the map in Visual Studio and following the link to the generated XSL file, which appears in the output window), so the question should be, in my view, whether the Mapper can generate XSL as good as a developer could write; but as you can imagine the answer really depends on the particular scenario - how many functoids are you using? how are they working together? what's the size of the map? what's its complexity?
Anyway, in my view there is a bottom line answer to that question and that is that under most real-world scenarios custom written XSL will almost always be better than generated one, but I'll try to explain a little bit more -
When you're using Functoids in your map you're generally doing one of two things - you're either calling external assemblies or you're adding some XSL lines to perform some actions for you.
The former one is easier to tackle - if you need external assemblies you can call them from custom XSL as well (as I've explained here ); as the Mapper will do exactly the same, the performance impact will generally be identical in both cases (using mapper or custom XSL).
The latter is harder to tackle, as there's no one-rule-fits-all statement one can make - but here's a shot at it -
The Mapper is a visual, generic, designer that generates code.
As is always the case with these tools they come with a price, and that price is often the quality of the code generated; now - don't get me wrong - I don't argue that the Mapper is bad, or that it always generates bad, slow XSL; but if you know XSL well, there's no doubt you will write better code than a generator will.
When you're adding a Functoid that does not call an external assembly you'll be doing one of three things -
All three are perfectly fine, and even more so - if you try them out you'll see that the designer does generate quite nice XSL in all cases.
The problem starts when, and this is inevitable in the real-world, the maps get more complex.
Once you move out of the playing ground and into real scenarios, the maps get more complicated and the inefficiency of the generated code becomes both more apparent (as multiple Functoids need to work together to achieve the desired output the XSL gets 'uglier and uglier') and that inefficiency becomes a greater problem as it is repeated many times over a large-ish map.
Bottom line, from my perspective - if you (and the rest of the team) feel comfortable with XSL - you will generally achieve better scripts than a generator would, so use it. If you don't feel comfortable with XSL - learn it! It's easy! (and in the meantime use the mapper).
Recently though, Mike Stephenson, a great person and a very smart cookie, has pointed them out to me, and not in a very flattering way -
Mike has found an article he published on his blog appearing in TopXML without any reference to his name; judge for yourself - here's the TopXML link and here is the link to Mike's original post
To make matters worse, when Mike - who's a nice bloke as I've already mentioned - pointed this out to TopXML politely he received the following (rather stupid) response -
The blogger (you) is clearly listed at the top of the page
"Blogger : Geekswithblogs.net"
and at the end of the post we provide another link to you:
"Read comments or post a reply to : What is returned from my 2 way
port when a fault is used?"
I am providing TWO clear links back to you and giving credit to you.
We're increasing your fame! :)
Hmm...makes sense of course...(or it would have been if Mike's name was indeed 'geekswithblogs' but I don't believe it is)
That kind of started me thinking - from a blogger's point of view - one could argue that, assuming your name does indeed appear next to your content when it's being aggregated or replicated in other web sites, there's little harm; if anything - you might be getting more exposure (assuming TopXML has a higher rank in search engines than one's blog)
On the other hand, seeing my content in sites like TopXML takes it out of its context. It does not appear next to other (possibly related) posts of mine, next to links to other content on my site/blog, next to my about box (and the MVP logo), next to my favourite links and references to other bloggers, or whatever else I chose to put on there.
Furthermore - if I was trying to make money out of advertisements on my site, it denies me that benefit as well, so to put it simply - copying other people's content - without permission - in a systematic way (I'm not talking about the odd blog post which refers to someone else's post), with or without referencing the source, is very wrong in my view.
Usually, my exception handlers in the orchestration are quite short; this time, however, I wanted to do a bit more, which included calling a web service when a particular exception is caught.
While implementing this I learnt something interesting (which, arguably, I should have known a long time ago – just to show you how difficult it is to catch up on all the changes in the .net framework) -
In .net framework 2.0 a Data property of type IDictionary was added to the Exception class, which on its own is not a problem, only that IDictionary is not serialisable and so could have proved rather difficult to anyone using Exception, especially in a BizTalk environment.
Luckily (but not surprisingly) the .net framework team have implemented ISerializable in the Exception class, which helps, but does cause a small headache to the unsuspecting BizTalk developer (me).
But first - I have to apologise - again I'm not familiar with all the details around this, and am resorting to pure guesses on a couple of points (will be happy to get more information if you care to enlighten me), still - I'm sure this will be useful to most people...
When you mark a class as [Serializable], the runtime, as it deserialises the class, attempts to call a parameterless constructor to create an instance of the type; the serialiser then populates all the members of the class through their public properties (I suspect that this is, partly at least, why Xml Serialisation serialises public members only).
When working with ISerializable, however, the runtime expects a constructor that takes SerializationInfo and StreamingContext as parameters; it is expected that the constructor will populate the members out of the SerializationInfo collection.
I believe that the runtime interrogates the type to be deserialised and, once it finds that the type or any type in its inheritance path implements ISerializable it takes the second approach mentioned.
Not realising the Exception class implements ISerializable, I did not have the expected constructor in my class, which meant that when BizTalk tried to deserialise the object (between the send shape calling the web service and the receive shape expecting the response) it failed, which now explains the error reported in the event log -
The constructor to deserialize an object of type ‘[custom exception class name here]’ was not found.
Adding the constructor with the two parameters to my custom exception class allowed it to pass the deserialisation with no errors; however – I was now facing a second problem – after indicating that my class implements ISerializable and adding the required constructor, the members of the Exception class, from which my class inherited, including the Data member, now deserialised correctly; my own class' members, however, did not.
There are two ways to overcome this - I could have simply mapped my properties (after all I only had a couple of strings to keep with the exception) to the Exception's Data property (have the getter and setter of each property use the collection internally, so that all the data is captured in the Exception base class and serialised with it), or - I could implement ISerializable fully, which really only means
1. Firstly - adding my members to the SerializationInfo passed to GetObjectData:
    public override void GetObjectData(SerializationInfo si, StreamingContext context)
    {
        // Let the Exception base class add its own members (including Data) first
        base.GetObjectData(si, context);
        si.AddValue("member1Name", member1Value);
        si.AddValue("member2Name", member2Value);
    }
2. Secondly - populating the members back in the constructor:
    protected MyCustomException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        // The keys read here must match the ones written in GetObjectData
        member1Value = info.GetString("member1Name");
        member2Value = info.GetString("member2Name");
    }
Voila! It all serialises and deserialises ok now. If only I didn't have to spend a whole day to figure this out!
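For comparison, the first option mentioned above (backing the custom properties with the base class' Data collection) might look roughly like the sketch below. The class name `OrderFailedException` and its `OrderId` and `CustomerId` properties are hypothetical, and I'm assuming the .net framework that BizTalk runs on; newer runtimes flag the serialisation constructor as obsolete.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical custom exception; the names are illustrative only
[Serializable]
public class OrderFailedException : Exception
{
    public OrderFailedException(string message) : base(message) { }

    // Still required, because Exception implements ISerializable
    protected OrderFailedException(SerializationInfo info, StreamingContext context)
        : base(info, context) { }

    // Each property reads and writes the base class' Data collection,
    // so the values travel with whatever Exception already serialises
    public string OrderId
    {
        get { return (string)Data["OrderId"]; }
        set { Data["OrderId"] = value; }
    }

    public string CustomerId
    {
        get { return (string)Data["CustomerId"]; }
        set { Data["CustomerId"] = value; }
    }
}
```

With this shape there is no GetObjectData override to keep in sync, at the price of the values being visible (and changeable) to anyone who touches Data directly.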
MSDN now supports wiki in parts - and most importantly for anyone reading this blog (I suspect) - in the BizTalk 2006 R2 documentation.
Check out this page and see how Eric Stott has kindly pointed out a huge improvement that was mentioned almost as a by-the-way statement in the docs. Way to go Eric!
So here ya go, now there's an easy way to share thoughts, ideas, additions and corrections to the msdn content. Brilliant!
Last week Oleg has found yet another elegant and simple way to do this -
tasklist /SVC /FI "IMAGENAME eq btsntsvc.exe"
Run this from a command line or add it as an external tool in Visual Studio (to get the result in the output window) and you will get a list of all the BizTalk hosts with their description and process id, so you'd know which one to attach to.
Using the name of another pioneer inventor (Alessandro Volta), What is Microsoft trying to hint?
Well - Volta is described as "an experimental developer toolset that allows developers to build standards-conformant, multi-tier web applications using established .NET languages, libraries and development tools"
From the little I've seen, it brings web development even closer to any "classic" .net development, bringing even greater separation between code and presentation on the code side, while at the same time allowing an even smoother user experience, and - from a developer's perspective - it allows writing code first, and deciding whether it should be executed at the client or on the server later.
The idea is that you write your code, with everything running on the client (which makes it easier to debug), and - as you get closer to the release - you move some attributes around and the code will be refactored to split execution between the client side and server side.
Naturally this means that Volta kindly takes care of all the communication and security code between the tiers, which removes the need for the developer to deal with all that "plumbing"; they even throw in some instrumentation code that lets you view the trace of the execution between the client and the server using the WCF Service Trace Viewer Tool.
So - in a somewhat simplified statement - Volta is about refactoring your code to split it between tiers, and is then about hosting the server code.
Hosting, I believe, is done at this point in a "Volta Server" executable; I'd expect this to evolve significantly as Volta matures into much more robust hosting options, and as MS has already done a lot in this area quite recently it's not difficult to see where this is going.
The refactoring is done on the MSIL code generated during the build of your project, and not on the actual code, which is a nice approach (and means you can definitely use any .net language, as they should all end up with the same MSIL, right?! :-) )
The goal of Volta, and the reason most of this is happening "behind the scenes", is to reduce even further the amount of things we have to deal with when building multi-tier web applications. Is this going to work?
Like most developers, I guess I'm a little bit of a control freak when it comes to my code, I don't even like code generators and wizards, so the thought of something taking my code and fiddling with it - splitting it, adding a bit of this and a bit of that, and generating client side javascript to describe my classes - is a bit scary; but then again - isn't that just a normal phase one has to go through in the face of innovation? Well - I guess it depends on how good the innovation is :-) I'll have to wait and see how this evolves.
There's no doubt in my mind that the code generation bit, and probably the client side more than the server side, is the achilles heel here, and with Volta only being out a few hours, people have already started complaining that the libraries are too big, they are too slow etc.
What's important to remember, when looking at all of this, is that this is still very much experimental - as far as I know (but I don't know much :-) ) there isn't a product roadmap yet that has Volta as a clear part of it.
Think of this preview as a way to get involved early with stuff MS are playing with, and if you do - write about it, get as much feedback as you can out there, it will only help MS get the feeling of what's working and what's not working, to make sure that when it does find its way into a product of some sort it will deliver.
So - it is not surprising that, even by Microsoft's own admission, Volta is not yet optimised; the javascript generated is not the most efficient or most elegant that can be created, and probably the same can be said of the server side code and the communication layer (but I haven't looked, so I can't possibly comment).
This will definitely improve as MS keeps working on this, and as feedback is provided by us.
Go and play!
check out www.joinmicrosofteurope.com
Anyway - the text box in the UI restricts input to digits only, so that you can only ever enter numeric values, but they must store it as a string because when I sort my work items on Rank - "5" is greater than "20", "05" is not.
Am I missing anything obvious or have they?
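The behaviour is what you'd expect if Rank is compared as text rather than as a number. Here is a small sketch of the difference; the `RankSorting` class and its method names are mine, and I'm only guessing that the product does something along the lines of the first method.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RankSorting
{
    // Compares character by character, so "5" sorts after "20"
    public static List<string> SortAsText(IEnumerable<string> ranks)
    {
        return ranks.OrderBy(r => r, StringComparer.Ordinal).ToList();
    }

    // Parses each value first, so 5 sorts before 20
    public static List<string> SortAsNumbers(IEnumerable<string> ranks)
    {
        return ranks.OrderBy(r => int.Parse(r, CultureInfo.InvariantCulture)).ToList();
    }
}
```

Given "5", "20" and "05", the text sort yields "05", "20", "5", while the numeric sort puts "20" last.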
It is also often said that BizTalk is a strongly typed system, although, I suspect, this refers mostly to the orchestration environment and less to the messaging environment, as almost anything in an orchestration, as in programming languages such as C# or VB.net, has a type (class) and an instance.
For example - when you want to declare a port - you are really first declaring a port type, and then a port of that type; when you create a correlation set you first define a correlation type, and then a correlation set of this type, etc.
Unfortunately, when it comes to messages, quite possibly the most fundamental item in an orchestration, the picture gets a little bit more confusing -
In an orchestration you can create a message out of four "things" -
While all four options follow on the type-instance approach, which keeps the statement that an orchestration is a strongly typed environment correct, it can be somewhat confusing to understand to those beginning with BizTalk Server.
However - what's even more confusing is that there seems to be a gap between what a message type means in the orchestration environment and what it means in the messaging environment, and there's a potential for confusion where the two meet -
As discussed earlier, in the messaging world a message type is typically the root node and xml namespace when dealing with Xml; when dealing with .net types the same applies, if you consider the xml that results from serialising the class (and not the class itself) as what determines the message type: it is the serialisation attributes that determine the namespace and root node used, and as such the message type.
The question is - what is the story with multi-part messages in the messaging layer?
Well - the basic answer is quite short - in the messaging layer - the namespace and root node of the body part of a multi-part message is used as the message type.
This, however, can cause some problems with subscriptions and developers must be aware of this difference of approach between the two layers -
Let's see if I can demonstrate this with an abstract scenario -
I create a solution with two orchestrations - ProcA and ProcB
I also add to my solution two schemas A.xsd and B.xsd
In ProcA I take in a message of type A.xsd in an activating receive
In ProcB I take in a message of type B.xsd in an activating receive.
I've effectively created two subscriptions for these messages, pretty basic.
Now, imagine I have a third process, ProcC, and that this process has a multi-part message with two parts - first part uses a different schema - C.xsd and second part uses A.xsd.
Effectively, I've created a new type in my orchestration which has a name and is composed of C.xsd and A.xsd (let's call the type C*)
What would happen if ProcC published such a message to the message box? We'll get a routing failure as we don't have any process subscribing to that message.
What I think can be confusing is what "that message" means in that context-
While in the orchestration a multi-part message with two parts is a completely different message from any of the other ones defined in the process, in the messaging layer, it's the body part that counts, so that the reason my multi-part message had no subscribers is not down to the fact that there was no orchestration subscribing to C* (the multi-part message) but that there was no orchestration subscribing to a message of type C.xsd (the type of the first part in the multi-part message)
And indeed - if I created a new multi-part message where the first (body) part was of type A.xsd and the second part was of schema C.xsd (call it A*) and published it to the message box, contrary to what one might expect (which would be another routing failure, as there's no process subscribing to A*) ProcA will get triggered.
Although in the orchestration world a message of type A.xsd and a multi-part message A* are two completely different types (even if one of the parts in A* is, indeed, defined using A.xsd) in the messaging world, it is only the type of the body part that counts so that the two are the same.
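To make the scenario a bit more concrete, here is a toy model of it. It is nothing like the real message box; it only mimics the one rule discussed here, that the message type is taken from the body part as namespace#rootnode. The `MultiPartMessage` and `ToyMessageBox` classes and the `http://example.org/...` namespaces are invented for the sketch.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Toy stand-in for a multi-part message: one body part plus any others
public sealed class MultiPartMessage
{
    public MultiPartMessage(XElement bodyPart, params XElement[] otherParts)
    {
        BodyPart = bodyPart;
        OtherParts = otherParts;
    }

    public XElement BodyPart { get; private set; }
    public XElement[] OtherParts { get; private set; }

    // Only the body part's namespace and root node feed the message type
    public string MessageType
    {
        get { return BodyPart.Name.NamespaceName + "#" + BodyPart.Name.LocalName; }
    }
}

public static class ToyMessageBox
{
    // subscriptions maps a subscriber name to the message type it wants;
    // returns the subscribers that would receive the message
    public static List<string> Route(MultiPartMessage message,
                                     IDictionary<string, string> subscriptions)
    {
        return subscriptions
            .Where(s => s.Value == message.MessageType)
            .Select(s => s.Key)
            .ToList();
    }
}
```

With ProcA subscribed to `http://example.org/A#A` and ProcB to `http://example.org/B#B`, a message whose body part is an A element is routed to ProcA whatever its other parts are, and one whose body part is a C element finds no subscribers at all.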
Hope this makes sense
Everyone says it, and it's true - these are very exciting times to be in software development, especially in the enterprise and BPM/EAI in particular perhaps, and I can't help the feeling we've really only scratched the surface on this one, a lot more is still to come.
I'm sure we're all going to hear, read and write a lot in the next few weeks; I know I've got tons of new stuff to read on my desk now, and hopefully will get to see some nice stuff coming out of this in a few months time.
BTW, don't forget to check out the newly launched SOA web site as mentioned by Marjan Kalantar
I've been doing some reading on Astoria (and generally on the REST architecture style) recently, and although I have one serious reservation I'm about to discuss I think this is a very interesting area to look at;
What is my reservation regarding "Astoria" then? well, I'll start by saying, of course, that it could very well be simply my lack of understanding, and of course this is all based on a very first impression of "Astoria", and I'm quite certain a lot is still about to change in this area (which is why many "smaller" questions can simply be left aside for now I think), but anyhow - what I'm not sure about at this point is whether, and how, can "Astoria" and SOA fit together, as to me they seem to conflict.
I will not explain Astoria here, as there is enough information out there, but if you just take one sentence from the link provided above - "The goal of Microsoft Codename Astoria is to enable applications to expose data as a data service that can be consumed by web clients within a corporate network and across the internet. "
Also, from playing around a little bit with Astoria, and from Guy's demonstrations , I see "Astoria" as an easy way to publish REST like data services to allow a client to perform CRUD operations on data entities; currently SQL Server is supported, but it is quite obvious this is going to evolve to other platforms.
So - my problem is, quite naturally I should think, with the CRUD bit; I like services, and I am learning to like REST (although have not used it in practice anywhere yet), but over the last couple of years I've been hearing all around (not to say thinking on my own right) that CRUD services are bad, and I do think there are lots of reasons as to why they are bad (some are very nicely explained here)
Do we really want a service that simply provides create/read/update/delete operations on a table, or view?
As I've mentioned before - this raises a lot of questions around security, and transactions etc. some can be solved simply by the fact that "Astoria" is built on WCF which provides support for some of these, but will it be enough? Time will tell.
Either way, from a design perspective, it seems to me that CRUD services like that are only useful, at least in my (narrow?) world, when they actually form a part of a bigger service, one that would actually provide business logic on top of the data logic, and that will "speak" in business terms rather than in database entity terms.
For example, if I had a service that maintains inventory in a warehouse, and a product was to be shipped to a customer, using a CRUD-like service you would update a row with the new inventory value (so your message would look something like-[Update inventory for product 'x' with '15']);
Using a business service, the message you would send will probably look something like ['2' items of product 'x' need to be shipped to customer 'y']; the service can then update the stored inventory with the correct value (without worrying about concurrency, I might add), as well as do whatever logic is required around that, for example initiating the packaging and shipment of the product to the customer, updating other systems, etc.
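The contrast between the two message styles can be sketched as two operations on the same (entirely hypothetical) in-memory service; `InventoryService` and its members are illustrative names only, and the packaging and shipment steps are left as a comment.

```csharp
using System;
using System.Collections.Generic;

public class InventoryService
{
    private readonly object _sync = new object();
    private readonly Dictionary<string, int> _stock;

    public InventoryService(IDictionary<string, int> initialStock)
    {
        _stock = new Dictionary<string, int>(initialStock);
    }

    public int GetQuantity(string productId)
    {
        lock (_sync) { return _stock[productId]; }
    }

    // CRUD style: the caller has already worked out the new value,
    // so two callers working from the same stale read overwrite each other
    public void UpdateInventory(string productId, int newQuantity)
    {
        lock (_sync) { _stock[productId] = newQuantity; }
    }

    // Business style: the caller states the intent and the service
    // derives the new value from its own current state
    public int ShipToCustomer(string productId, int quantity, string customerId)
    {
        lock (_sync)
        {
            int current;
            if (!_stock.TryGetValue(productId, out current) || current < quantity)
                throw new InvalidOperationException("Not enough stock for " + productId);

            _stock[productId] = current - quantity;
            // Packaging, shipment to customerId and updates to other
            // systems would be initiated from here
            return _stock[productId];
        }
    }
}
```

The second operation is the one that lets the service, rather than every client, own the concurrency and the follow-on logic.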
Now, it is true that "Astoria" does provide extensibility points, and one can write code around that data retrieval logic provided to perform some actions before, or after the data was retrieved, but is that a good way to implement business logic? this is a very data centric, data entity aligned, way of thinking, which does not sit well with all that has been said and done with SOA to my understanding.
I like the REST approach for urls and am hoping to see more of it used, but I don't like the idea that logic is treated as essentially pre or post-processing stage on top of a data retrieval logic, nor do I think that business processes should think in data entities terms.
It would be extremely cool to expose REST endpoints to processes in BizTalk or WF, but I feel that is the only way to align REST and SOA and in my mind "Astoria" is a deviation from what I think of a good implementation of SOA.
I'm honoured and humbled and quite excited I have to admit.
I think I'm up to a great year!
And of course - a huge thank you for everyone involved...
Imagine a class that looks something like this -
public class MyClass
{
public static void test(string a, string b)
{
}
public static void test(int a, string b)
{
}
}
From an expression shape in BizTalk you can now use
MyClass.test("string","string");
or
MyClass.test(1,"string");
and everything compiles just fine.
but add the following overload to your class -
public static void test(string a, IDictionary<...> c)
{
}
and now try to use the first overload from an expression shape (which worked just fine before) -
MyClass.test("string","string");
and you will get - "unknown system exception" with no file name or line number in sight to help.
I know BizTalk does not support generics, but I did not think having generics in another method, one I'm not calling from BizTalk, would be a problem; and in fact - change the call in the expression shape to the overload that takes the int first -
MyClass.test(1,"string")
and it would compile just fine.
From this I can only gather that this has something to do with how .net resolves the overload to use - in the error case, it looks at the first parameter passed in and identifies it as a string, so both the [string,string] and the [string,IDictionary] overloads fit, which means it needs to evaluate the second parameter, and, presumably, this is where it "blows up";
By changing the first parameter passed in to be an int, the compiler does not need to look into the overload that uses IDictionary at all and so it compiles ok.
And indeed, as soon as I add this overload to the class -
public static void test(int a, IDictionary<...> c)
{
}
I get an "unknown system exception" when calling using int as the first parameter.
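If the guess above is right, one possible way around it is to make sure the generic-taking method is never a candidate for the call made from the expression shape, for instance by giving it a different name. This is only a sketch of that idea and I have not verified it against the orchestration compiler; the `testWithDictionary` name, the `string, string` type arguments and the string return values are all my own additions for illustration.

```csharp
using System.Collections.Generic;

public class MyClass
{
    // The two overloads that are called from expression shapes
    public static string test(string a, string b)
    {
        return "string,string";
    }

    public static string test(int a, string b)
    {
        return "int,string";
    }

    // Renamed, so resolving a call to test(...) never has to
    // look at a signature containing a generic type
    public static string testWithDictionary(string a, IDictionary<string, string> c)
    {
        return "string,dictionary";
    }
}
```

Callers that do understand generics (plain .net code) simply use the renamed method.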







