If you look at web services or interop in general, you will sooner or later find SOAP. SOAP is an acronym for Simple Object Access Protocol.
This implies RPC, and SOAP is often associated with it. RPC stands for remote procedure call: invoking a method on a remote object in a way that feels like invoking a method on a local object. In .NET terms, this can be viewed rather like calling a static method.
But SOAP is much more than RPC. It can certainly be used for that purpose, but that is not its only application. You can also use SOAP for queued messaging, request-response schemes, one-way messages, simple information exchange and many other messaging schemes.
So what is SOAP then?
SOAP is a scheme that uses XML messages for information exchange or for the invocation of remote methods. The beautiful thing is that SOAP is independent of platform, transport protocol and programming model. This means you can use it in any language and with any protocol, like HTTP, TCP, IPC, MSMQ, SMTP and others.
To make use of this flexibility, there are two XML schemas you need to know when you want to apply SOAP. The first one is WSDL, the so-called Web Services Description Language (which we will look into in the aptly named post WSDL). The second XML schema is the one that describes the SOAP messages. A SOAP message can be embedded in the body of another message, such as an HTTP request or response.
What does this post cover
- Ideas and history of SOAP
- Structure of a SOAP message
- Small comparison to REST
What does this post not cover
- XML and its structure
- Http/web request/response
- What are web services in general
- What is RPC
- UDDI for Webservices
SOAP is a protocol, as opposed to REST, which is an architectural style, yet both are used for web services. Some consider REST more extensible and scalable. In the end each is a tool, and one should pick the right tool for the job.
As WCF maps to SOAP, it is still quite useful in the .NET technology landscape.
We will now take a closer look at how SOAP accomplishes the sending of messages and how the XML messaging framework is standardized.
SOAP looks in essence like this: SOAP sender --> SOAP message (any protocol) --> SOAP receiver.
So at its heart SOAP is about sending XML messages over a range of network protocols, and not about accessing objects as the name implies (the name was only kept for historical reasons).
SOAP accomplishes this by providing an XML-based messaging framework that is extensible, protocol independent, and independent of programming models (again as opposed to the word Object in the acronym). We will now explore those three properties of SOAP XML messaging.
1. Extensibility
Extensibility is a key factor of SOAP. For the web, simplicity is king (and I’d say for pretty much everything in coding). This is especially true for scenarios where interop is in play. SOAP therefore keeps its core small and can be extended with the WS-* specifications published by W3C and OASIS, like WS-Addressing, WS-Policy, WS-Security, WS-Federation, WS-ReliableMessaging, WS-Coordination, WS-AtomicTransaction, WS-RemotePortlets.
Transport protocols make a distinction for control information and message payloads, which are mostly called header and body of the message. SOAP is no different in this regard as we will see in this post.
2. Protocol independence
SOAP can be used with any transport protocol like TCP, IPC, HTTP, SMTP, even MSMQ (see WCF, which uses SOAP extensively). To adhere to this, SOAP needs to specify the bindings that have to be used. This is done in WSDL, which is covered in its own post.
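As a rough illustration of one such binding, the sketch below wraps a SOAP 1.1 envelope in an HTTP POST request using Python's standard library. Nothing is sent over the network. The endpoint URL and the action URI are made-up placeholders, and the function name build_http_request is only an illustrative choice.

```python
import urllib.request

# A minimal SOAP 1.1 envelope with an empty Body.
ENVELOPE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body/></soap:Envelope>"
)


def build_http_request(url, soap_action, envelope):
    """Wrap a SOAP 1.1 envelope in an HTTP POST request (nothing is sent)."""
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={
            # SOAP 1.1 over HTTP uses the text/xml media type.
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 over HTTP states the intent in a quoted SOAPAction header.
            "SOAPAction": '"%s"' % soap_action,
        },
        method="POST",
    )


request = build_http_request(
    "http://example.org/service",          # hypothetical endpoint
    "http://example.org/service/Connect",  # hypothetical action URI
    ENVELOPE,
)
print(request.get_method(), request.full_url)
```

The same envelope could be handed to a different transport without changing a single XML element, which is the point of keeping the message format and the binding separate.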
3. Programming model independence
SOAP is not tied to RPC but can instead be used for any programming model. The name implies that you call methods on a remote object, and even though you can certainly use it in this manner, it is not the only capability of SOAP. You can use SOAP for one-way operations, queued messaging, request/response cycles, solicit/response, notifications and durable calls for long-running peer-to-peer “conversations”.
With these three properties in mind, we can look further into the building blocks of SOAP.
The SOAP messaging framework defines a suite of XML elements that describe how to set up every possible way of XML messages for exchange between different systems.
These elements are the following:
- Envelope, the outer root of each soap message
- Header (optional)
- Body (obligatory)
- Fault (optional), which describes an error
In SOAP 1.1 these elements are provided by the http://schemas.xmlsoap.org/soap/envelope/ namespace; SOAP 1.2 uses http://www.w3.org/2003/05/soap-envelope instead.
The structure of a SOAP 1.2 envelope then looks like this:
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    ... ...
  </env:Header>
  <env:Body>
    ... ...
  </env:Body>
</env:Envelope>
SOAP messages are easily identifiable by their Envelope root element. It is also easy to spot the SOAP version by inspecting the namespace name of the envelope. The envelope must be the root element of the SOAP XML and needs to have exactly one Body that defines the payload. The envelope namespace changes with each SOAP version. If a receiver gets an envelope in a version it does not support, it generates a fault message (VersionMismatch).
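The namespace check described above can be sketched in a few lines of Python with the standard library XML parser. The function name soap_version and the decision to raise ValueError are illustrative choices, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

# Envelope namespaces defined by the two SOAP versions.
SOAP_NAMESPACES = {
    "http://schemas.xmlsoap.org/soap/envelope/": "1.1",
    "http://www.w3.org/2003/05/soap-envelope": "1.2",
}


def soap_version(xml_text):
    """Return "1.1" or "1.2" for a SOAP envelope, or raise ValueError."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified names as "{namespace}localname".
    if not root.tag.startswith("{"):
        raise ValueError("root element has no namespace")
    namespace, local_name = root.tag[1:].split("}", 1)
    if local_name != "Envelope":
        raise ValueError("root element is not an Envelope")
    if namespace not in SOAP_NAMESPACES:
        raise ValueError("unknown envelope namespace: " + namespace)
    # The envelope needs exactly one Body child.
    bodies = root.findall("{%s}Body" % namespace)
    if len(bodies) != 1:
        raise ValueError("expected exactly one Body element")
    return SOAP_NAMESPACES[namespace]


EXAMPLE = (
    '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    "<env:Header/><env:Body/></env:Envelope>"
)
print(soap_version(EXAMPLE))  # prints 1.2
```

A real receiver would answer an unknown namespace with a fault message rather than a local exception; the exception here only keeps the sketch short.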
The SOAP header can contain multiple elements (header blocks). The header as a whole is optional, yet a sender can make the processing of an individual header block mandatory by setting its mustUnderstand attribute to true ("1" in SOAP 1.1). A receiver that does not understand such a block must answer with a fault.
Another header attribute is actor (called role in SOAP 1.2), which names the SOAP node along the message path that a header block is meant for. Header blocks can also specify transactional behavior and much more.
The body contains the payload of the SOAP message and is a mandatory element. It must come after the optional header. The semantics that need to be applied to the body can be found in the given SOAP schema. The body carries the information that is destined for the ultimate receiver of the message.
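To make the two attributes concrete, here is a small sketch that builds a SOAP 1.1 envelope with one header block. The Transaction element and its namespace http://example.org/transaction are invented for the illustration; only the envelope namespace, the two attributes and the "next" actor URI come from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP11 = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "http://example.org/transaction"

# Pick readable prefixes for the serialized output.
ET.register_namespace("soap", SOAP11)
ET.register_namespace("tx", APP_NS)


def build_envelope(transaction_id):
    """Build an envelope whose single header block must be understood."""
    envelope = ET.Element("{%s}Envelope" % SOAP11)
    header = ET.SubElement(envelope, "{%s}Header" % SOAP11)
    block = ET.SubElement(header, "{%s}Transaction" % APP_NS)
    # "1" marks the block as mandatory for the node it is addressed to.
    block.set("{%s}mustUnderstand" % SOAP11, "1")
    # The "next" actor addresses the first node that receives the message.
    block.set("{%s}actor" % SOAP11, "http://schemas.xmlsoap.org/soap/actor/next")
    block.text = transaction_id
    # The Body always comes after the Header.
    ET.SubElement(envelope, "{%s}Body" % SOAP11)
    return envelope


print(ET.tostring(build_envelope("5"), encoding="unicode"))
```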
A SOAP message carries zero or one Fault block. The Fault block holds status and error information for the SOAP message. In SOAP 1.1, a Fault block has the following elements:
- faultcode. Used to identify the fault code (see below)
- faultstring. A custom description for the error
- faultactor. If the faulting node is not the ultimate receiver, this has to be specified here
- detail. Custom message from the service.
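The fault codes defined by SOAP 1.1 are:
- VersionMismatch (the envelope namespace is not the expected one),
- MustUnderstand (a mandatory header block was not understood),
- Client (incorrect information / not well formed),
- Server (a problem on the server side)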
A Fault element is part of the body and looks like this:
<SOAP-ENV:Fault>
  <faultcode xsi:type="xsd:string">SOAP-ENV:Client</faultcode>
  <faultstring xsi:type="xsd:string">failed to locate method (Connect).</faultstring>
</SOAP-ENV:Fault>
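On the receiving side, a client has to look inside the Body to find out whether it got a result or a fault. The following is a minimal sketch of that check for SOAP 1.1, assuming the Fault element is qualified with the envelope namespace while its children are not. The function name read_fault and the sample response are illustrative.

```python
import xml.etree.ElementTree as ET

SOAP11 = "http://schemas.xmlsoap.org/soap/envelope/"

# Sample response, built around the fault shown above.
RESPONSE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Client</faultcode>
      <faultstring>failed to locate method (Connect).</faultstring>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def read_fault(xml_text):
    """Return (faultcode, faultstring) if the Body holds a Fault, else None."""
    root = ET.fromstring(xml_text)
    fault = root.find("{%s}Body/{%s}Fault" % (SOAP11, SOAP11))
    if fault is None:
        return None
    # In SOAP 1.1 the children of Fault are not namespace-qualified.
    code = (fault.findtext("faultcode") or "").strip()
    text = (fault.findtext("faultstring") or "").strip()
    return code, text


print(read_fault(RESPONSE))
```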
SOAP encoding defines some built-in rules for encoding the data types that can be defined and used in a SOAP message, such as ints, floats, doubles and arrays. There are two kinds of types: scalar types, which hold exactly one value, like a number, a string, a date, a boolean or a time value, and compound types, which hold multiple values. The compound types are in turn separated into arrays and structs.
The SOAP processing model outlines rules for processing a SOAP message while it is being transported from sender to receiver. This allows for several nodes or hops between the sender and the receiver, where each hop, officially called an intermediary node, can use a different protocol. The receiver at the end of the message path is called the ultimate receiver.
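The split into scalar and compound types can be sketched as a small recursive encoder. This is a deliberately simplified illustration, not a full implementation of the SOAP encoding rules: it assumes the xsd prefix is declared on the surrounding envelope, names array members "item" by convention, and leaves out the array type attribute that real SOAP-encoded arrays carry. The function name encode is an illustrative choice.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def encode(name, value):
    """Encode a Python value as an XML element named `name`."""
    element = ET.Element(name)
    if isinstance(value, dict):      # struct: one child per named member
        for key, member in value.items():
            element.append(encode(key, member))
    elif isinstance(value, list):    # array: members are identified by position
        for member in value:
            element.append(encode("item", member))
    elif isinstance(value, bool):    # checked before int, since bool is an int subclass
        element.set(XSI_TYPE, "xsd:boolean")
        element.text = "true" if value else "false"
    elif isinstance(value, int):
        element.set(XSI_TYPE, "xsd:int")
        element.text = str(value)
    elif isinstance(value, float):
        element.set(XSI_TYPE, "xsd:double")
        element.text = repr(value)
    elif isinstance(value, str):
        element.set(XSI_TYPE, "xsd:string")
        element.text = value
    else:
        raise TypeError("unsupported type: " + type(value).__name__)
    return element


person = encode("person", {"name": "Ann", "age": 45, "scores": [1.5, 2.0]})
print(ET.tostring(person, encoding="unicode"))
```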
Intermediaries work as sender and receiver at the same time. This way they can intercept calls between the sender and ultimate receiver.
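What an intermediary does with a message can be sketched roughly as follows for SOAP 1.1: it consumes the header blocks addressed to it through the actor attribute, refuses blocks it is required to understand but does not, and forwards the rest. This is a simplified sketch that ignores the special "next" actor; the node URI, the Route and Session header blocks and the function name process_at_node are all invented for the example.

```python
import xml.etree.ElementTree as ET

SOAP11 = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR = "{%s}actor" % SOAP11
MUST_UNDERSTAND = "{%s}mustUnderstand" % SOAP11


def process_at_node(envelope, node_uri, understood_tags):
    """Consume header blocks addressed to node_uri and return the envelope to forward.

    Raises ValueError when a block addressed to this node is marked
    mustUnderstand="1" but its tag is not in understood_tags.
    """
    header = envelope.find("{%s}Header" % SOAP11)
    if header is None:
        return envelope
    for block in list(header):  # iterate over a copy, since blocks get removed
        if block.get(ACTOR) != node_uri:
            continue  # meant for another node: forward untouched
        if block.get(MUST_UNDERSTAND) == "1" and block.tag not in understood_tags:
            raise ValueError("MustUnderstand fault for " + block.tag)
        header.remove(block)  # processed here, so it is not forwarded
    return envelope


message = ET.fromstring(
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Header>"
    '<Route xmlns="http://example.org/routing"'
    ' soap:actor="http://example.org/router" soap:mustUnderstand="1"/>'
    '<Session xmlns="http://example.org/session"/>'
    "</soap:Header><soap:Body/></soap:Envelope>"
)
forwarded = process_at_node(
    message, "http://example.org/router", {"{http://example.org/routing}Route"}
)
remaining = [block.tag for block in forwarded.find("{%s}Header" % SOAP11)]
print(remaining)  # ['{http://example.org/session}Session']
```

The Session block has no actor attribute, so it travels on to the ultimate receiver, while the Route block is consumed by the router node.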
How it ALL translates to WCF
In .NET you can completely autogenerate your SOAP messages (and WSDL for that matter) when you use WCF. WCF maps pretty much exactly to SOAP, because SOAP is the underlying protocol it uses. This makes knowledge of SOAP quite useful, because a lot of concepts in WCF are a somewhat leaky abstraction over SOAP. That is no problem, because WCF takes away the complex XML and hides it behind endpoints, bindings, addresses, InstanceContexts and so on.
See my series on WCF for how to use SOAP in a .NET context with Windows.
As stated before, the advantages of SOAP are its protocol independence, its extensibility and its compliance with the WS-* standards for web services published by W3C and OASIS, like WS-Addressing, WS-Policy, WS-Security, WS-Federation, WS-ReliableMessaging, WS-Coordination, WS-AtomicTransaction, WS-RemotePortlets. It also supports integrated fault management with standardized error codes.
Another advantage is that it is widely available for all standard programming languages, which makes interaction with it quite easy. Furthermore, you can easily create transactional behavior, even for distributed transactions.
SOAP can be really complex to work with if you need to create and consume SOAP messages by hand, due to the XML that is necessary. It has a low tolerance for programming errors. Message size and the computing power needed for parsing and validation are quite high, which can lead to performance and scalability issues.
The XML of a SOAP message is in itself pretty verbose, and if SOAP is used for RPC, it couples the services much more tightly than is desirable for web services and interop.
In this post we took a look at SOAP: how it is structured, what it does and how you can use it.
SOAP messages are mostly generated by tooling, and one seldom has to create a SOAP message by hand. In .NET you will mostly create SOAP web services with WCF.
SOAP in itself can be used for RPC or for any other messaging scheme. It is highly interoperable because of its XML messaging, protocol independence and extensibility features.
A SOAP message consists of an Envelope root element, an optional Header with an arbitrary number of elements, a mandatory Body that has all the information needed to consume the message, and an optional Fault element that is embedded in the Body of the SOAP message.