The eXtensible Markup Language (XML)

Simple Object Access Protocol (SOAP)

The W3C defines SOAP as "a lightweight protocol for exchange of information in a decentralized, distributed environment." [SOAP Version 1.2 Working Draft, July 2001]

There are a number of object communication protocols, including the Component Object Model (COM) and the Common Object Request Broker Architecture (CORBA), with support for distributed applications through Distributed COM (DCOM), the Internet Inter-ORB Protocol (IIOP), and Java Remote Method Invocation (RMI).  Two problems with these protocols are complexity and lack of full industry acceptance.

SOAP is a simple protocol that works with existing Internet standards (including XML, HTTP, SMTP) and appears to be gaining wide industry acceptance.

SOAP 1.1 was registered as a W3C Note (indicating that it had not gone through the normal W3C process) after being submitted by a broad-based industry group that included Ariba, CommerceOne, Compaq, HP, IBM, Iona, Lotus, Microsoft, and SAP.  The W3C is beginning formal work on SOAP 1.2.

Elements of SOAP

SOAP 1.2 consists of four parts:

  1. SOAP Envelope

     The SOAP Envelope is the mandatory outer wrapper of the message.  It describes what is in the message, who should deal with it, and whether it is mandatory or optional.  It contains the optional Header, which carries additional information, and the mandatory Body.

  2. SOAP Encoding Rules

     The SOAP Encoding Rules define serialization for application-defined data types.

  3. SOAP Remote Procedure Call (RPC)

     The RPC part is the convention used for remote procedure calls and responses.

  4. SOAP Binding

     The SOAP Binding defines a convention for exchanging envelopes (e.g., using HTTP).
SOAP Example

Here is a simple example (from the SOAP 1.2 Working Draft) of a SOAP message for notification of an alert.
                <env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope">
                  <env:Header>
                   <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
                    <n:priority>1</n:priority>
                    <n:expires>2001-06-22T14:00:00-05:00</n:expires>
                   </n:alertcontrol>
                  </env:Header>
                  <env:Body>
                   <m:alert xmlns:m="http://example.org/alert">
                    <m:msg>Pick up Mary at school at 2pm</m:msg>
                   </m:alert>
                  </env:Body>
                 </env:Envelope>
The Envelope defines the namespace and contains two sub-elements: Header and Body.  The Header is optional.  The Body is a mandatory top-level sub-element of the Envelope.
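
As a rough illustration of that structure, the following Python sketch parses the alert message and separates its Header and Body.  It uses only the standard library's xml.etree.ElementTree; the function name split_envelope and the NS prefix table are assumptions made for this example and are not part of SOAP.

```python
import xml.etree.ElementTree as ET

# Prefix table used only for lookups in this sketch.
NS = {
    "env": "http://www.w3.org/2001/06/soap-envelope",
    "n": "http://example.org/alertcontrol",
    "m": "http://example.org/alert",
}

ALERT_MESSAGE = """\
<env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope">
  <env:Header>
    <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
      <n:priority>1</n:priority>
      <n:expires>2001-06-22T14:00:00-05:00</n:expires>
    </n:alertcontrol>
  </env:Header>
  <env:Body>
    <m:alert xmlns:m="http://example.org/alert">
      <m:msg>Pick up Mary at school at 2pm</m:msg>
    </m:alert>
  </env:Body>
</env:Envelope>
"""


def split_envelope(xml_text):
    """Return (header, body); header is None when the message has none."""
    envelope = ET.fromstring(xml_text)
    if envelope.tag != "{%s}Envelope" % NS["env"]:
        raise ValueError("root element is not a SOAP Envelope")
    header = envelope.find("env:Header", NS)
    body = envelope.find("env:Body", NS)
    if body is None:
        raise ValueError("the SOAP Body is mandatory")
    return header, body


header, body = split_envelope(ALERT_MESSAGE)
priority = header.find("n:alertcontrol/n:priority", NS).text
message = body.find("m:alert/m:msg", NS).text
print(priority, message)
```

The lookups go through namespace URIs rather than prefixes, so the sketch would still work if a sender chose prefixes other than env, n, and m.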

SOAP Communication

SOAP does not define the transport protocol and can be used with almost any protocol.  HTTP is a natural choice since it supports simple request/response communications and is so widely used.  For these reasons HTTP is used as an example in the SOAP specifications.  Using HTTP allows passage through many firewalls. The experimental HTTP Extension Framework can be used to further control message processing, but it is not required and is not widely used as yet.

Here is a simple example of request and response communications.  First, the request:
                 POST /StockQuote HTTP/1.1
                 Host: www.stockquoteserver.com
                 Content-Type: text/xml; charset="utf-8"
                 Content-Length: nnnn
                 SOAPAction: "http://example.org/2001/06/quotes"

                 <env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope" >
                  <env:Body>
                   <m:GetLastTradePrice
                         env:encodingStyle="http://www.w3.org/2001/06/soap-encoding"
                         xmlns:m="http://example.org/2001/06/quotes">
                     <symbol>DIS</symbol>
                   </m:GetLastTradePrice>
                  </env:Body>
                 </env:Envelope>
This SOAP/HTTP message uses the HTTP/1.1 POST method to request the /StockQuote resource.  The content type is text/xml.  The SOAPAction header field is required.  The namespace for the content in the body of the message is declared on the Body child element GetLastTradePrice.

Here is the example reply:
                 HTTP/1.1 200 OK
                 Content-Type: text/xml; charset="utf-8"
                 Content-Length: nnnn

                 <env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope" >
                  <env:Body>
                   <m:GetLastTradePriceResponse
                         env:encodingStyle="http://www.w3.org/2001/06/soap-encoding"
                         xmlns:m="http://example.org/2001/06/quotes">
                    <Price>34.5</Price>
                   </m:GetLastTradePriceResponse>
                  </env:Body>
                 </env:Envelope>
Again, the text/xml content type is specified in the HTTP headers.
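
The request and reply above can be approximated in a few lines of Python.  This is only a sketch under stated assumptions: the helper names build_quote_request and read_quote_reply are invented for the example, nothing is actually sent over the network, and the headers are returned as a plain dictionary that an HTTP client of your choice would have to transmit.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2001/06/soap-envelope"
ENC_NS = "http://www.w3.org/2001/06/soap-encoding"
QUOTE_NS = "http://example.org/2001/06/quotes"
NS = {"env": ENV_NS, "m": QUOTE_NS}

# Prefixes are cosmetic; registering them keeps the output close to the example.
ET.register_namespace("env", ENV_NS)
ET.register_namespace("m", QUOTE_NS)


def build_quote_request(symbol):
    """Return (headers, payload) for the GetLastTradePrice POST; nothing is sent."""
    envelope = ET.Element("{%s}Envelope" % ENV_NS)
    body = ET.SubElement(envelope, "{%s}Body" % ENV_NS)
    call = ET.SubElement(body, "{%s}GetLastTradePrice" % QUOTE_NS)
    call.set("{%s}encodingStyle" % ENV_NS, ENC_NS)
    ET.SubElement(call, "symbol").text = symbol
    payload = ET.tostring(envelope, encoding="utf-8")
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "Content-Length": str(len(payload)),  # the "nnnn" placeholder above
        "SOAPAction": '"%s"' % QUOTE_NS,
    }
    return headers, payload


def read_quote_reply(payload):
    """Extract the price from a GetLastTradePriceResponse envelope."""
    envelope = ET.fromstring(payload)
    price = envelope.find("env:Body/m:GetLastTradePriceResponse/Price", NS)
    if price is None:
        raise ValueError("no Price accessor found in the reply")
    return float(price.text)


REPLY = """\
<env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope">
  <env:Body>
    <m:GetLastTradePriceResponse
        env:encodingStyle="http://www.w3.org/2001/06/soap-encoding"
        xmlns:m="http://example.org/2001/06/quotes">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </env:Body>
</env:Envelope>
"""

headers, payload = build_quote_request("DIS")
print(headers["SOAPAction"], read_quote_reply(REPLY))
```

Computing Content-Length from the encoded payload, rather than from the text, matters once the message contains non-ASCII characters.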

SOAP Envelope

The SOAP Envelope must be present.  It may have a namespace declaration and other namespace-qualified attributes.  It may contain an optional Header, which must come first if it is present.  It must contain a Body, which must come immediately after the Header (if there is one) or first (if there is no Header).  The Envelope may contain other namespace-qualified elements.
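
These ordering rules lend themselves to a small check.  The following Python sketch reports layout problems in an envelope; the function name check_envelope_layout and the wording of the messages are assumptions for illustration, and the check covers only the rules stated above, not full validation against the SOAP schema.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2001/06/soap-envelope"
HEADER_TAG = "{%s}Header" % ENV_NS
BODY_TAG = "{%s}Body" % ENV_NS


def check_envelope_layout(xml_text):
    """Return a list of layout problems; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}Envelope" % ENV_NS:
        return ["root element is not env:Envelope"]

    problems = []
    children = [child.tag for child in root]

    # The Header, when present, must be the first child.
    if HEADER_TAG in children and children.index(HEADER_TAG) != 0:
        problems.append("Header must be the first child of Envelope")

    # The Body comes right after a leading Header, or first when there is none.
    expected_body_index = 1 if children[:1] == [HEADER_TAG] else 0
    if BODY_TAG not in children:
        problems.append("Body is mandatory")
    elif children.index(BODY_TAG) != expected_body_index:
        problems.append("Body must directly follow the Header, or come first")

    # Any other child elements must be namespace-qualified.
    for tag in children:
        if not tag.startswith("{"):
            problems.append("child element %s is not namespace-qualified" % tag)
    return problems


WRAP = '<env:Envelope xmlns:env="%s">%%s</env:Envelope>' % ENV_NS
print(check_envelope_layout(WRAP % "<env:Header/><env:Body/>"))
print(check_envelope_layout(WRAP % "<env:Body/><env:Header/>"))
```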

SOAP Header

The SOAP Header is optional.  It may contain multiple blocks, with each immediate child element namespace-qualified.  The Header is a generic element that allows for additional communications between sender and receiver, such as authentication.  Optional attributes in header blocks include encodingStyle, actor, and mustUnderstand.

A SOAP message may pass through intermediaries on its way to the ultimate recipient.  The actor attribute identifies the node that a header block is intended for; that node must not forward the block as if it were merely an intermediary.  Different header blocks may name different actors.  The mustUnderstand attribute indicates whether processing of the block is mandatory for its recipient.  When it is set, the recipient must either fully process the block or not process the message at all.
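
The interplay of actor and mustUnderstand can be sketched as follows.  Several details here are assumptions made for the example rather than statements about the specification: a block with no actor attribute is treated as aimed at the ultimate recipient (represented by None), mustUnderstand is considered set when its value is "1", and the transaction and trace blocks, the logger URI, and the handler table are all hypothetical.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2001/06/soap-envelope"
ACTOR = "{%s}actor" % ENV_NS
MUST_UNDERSTAND = "{%s}mustUnderstand" % ENV_NS


class MustUnderstandFault(Exception):
    """Raised when a mandatory header block cannot be processed."""


def blocks_for_node(header, my_actors):
    """Return the header blocks targeted at this node.

    Assumption: a block without an actor attribute is meant for the
    ultimate recipient, which lists None among its actors.
    """
    return [block for block in header if block.get(ACTOR) in my_actors]


def process_header(header, my_actors, handlers):
    """Run one handler per targeted block, or reject the whole message."""
    targeted = blocks_for_node(header, my_actors)
    # Check every mandatory block first, so nothing is processed
    # if the message has to be refused.
    for block in targeted:
        if block.get(MUST_UNDERSTAND) == "1" and block.tag not in handlers:
            raise MustUnderstandFault(block.tag)
    return [handlers[b.tag](b) for b in targeted if b.tag in handlers]


MESSAGE = """\
<env:Envelope xmlns:env="http://www.w3.org/2001/06/soap-envelope">
  <env:Header>
    <t:transaction xmlns:t="http://example.org/transaction"
        env:mustUnderstand="1">5</t:transaction>
    <l:trace xmlns:l="http://example.org/trace"
        env:actor="http://example.org/logger"/>
  </env:Header>
  <env:Body/>
</env:Envelope>
"""

header = ET.fromstring(MESSAGE).find("{%s}Header" % ENV_NS)
handlers = {"{http://example.org/transaction}transaction": lambda b: int(b.text)}
print(process_header(header, {None}, handlers))
```

A real node would answer a MustUnderstandFault with a SOAP Fault message rather than a Python exception.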

SOAP Body

The SOAP Body is mandatory.  Immediate child elements of the Body must be namespace-qualified.  The encodingStyle attribute may be used to indicate the encoding style of the blocks.

The SOAP Fault element can appear in the Body.  It is used to carry error or status information and may appear only once (or not at all) in the Body.  The Fault has four sub-elements: faultcode, faultstring, faultactor, and detail.  The soap-envelope namespace defines a set of recommended faultcodes.  Here is an example from the SOAP 1.2 Working Draft:

                 <env:Envelope xmlns:env='http://www.w3.org/2001/06/soap-envelope'
                                       xmlns:f='http://www.w3.org/2001/06/soap-faults' >
                   <env:Header>
                     <f:Misunderstood qname='abc:Extension1'
                                                 xmlns:abc='http://example.org/2001/06/ext' />
                     <f:Misunderstood qname='def:Extension2'
                                                 xmlns:def='http://example.com/stuff' />
                   </env:Header>
                   <env:Body>
                     <env:Fault>
                       <faultcode>MustUnderstand</faultcode>
                       <faultstring>One or more mandatory headers not understood</faultstring>
                     </env:Fault>
                   </env:Body>
                 </env:Envelope>
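
A receiver might pull the fault details out of such a message along the following lines.  This is a minimal Python sketch; the function name read_fault and the shape of the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "env": "http://www.w3.org/2001/06/soap-envelope",
    "f": "http://www.w3.org/2001/06/soap-faults",
}

FAULT_MESSAGE = """\
<env:Envelope xmlns:env='http://www.w3.org/2001/06/soap-envelope'
              xmlns:f='http://www.w3.org/2001/06/soap-faults'>
  <env:Header>
    <f:Misunderstood qname='abc:Extension1'
                     xmlns:abc='http://example.org/2001/06/ext'/>
    <f:Misunderstood qname='def:Extension2'
                     xmlns:def='http://example.com/stuff'/>
  </env:Header>
  <env:Body>
    <env:Fault>
      <faultcode>MustUnderstand</faultcode>
      <faultstring>One or more mandatory headers not understood</faultstring>
    </env:Fault>
  </env:Body>
</env:Envelope>
"""


def read_fault(xml_text):
    """Return the fault fields as a dict, or None if the Body has no Fault."""
    root = ET.fromstring(xml_text)
    fault = root.find("env:Body/env:Fault", NS)
    if fault is None:
        return None
    # Sub-elements that are absent come back as None.
    info = {
        name: fault.findtext(name)
        for name in ("faultcode", "faultstring", "faultactor")
    }
    # Collect the header blocks that name what was not understood.
    info["misunderstood"] = [
        block.get("qname")
        for block in root.findall("env:Header/f:Misunderstood", NS)
    ]
    return info


fault = read_fault(FAULT_MESSAGE)
print(fault["faultcode"], fault["misunderstood"])
```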

SOAP Encoding

In messages, it may be important to encode and decode data in conformance with the particular data-types used in the application.

In essence, the two problems are:

1) Take data that may be of a type defined in the application and write it out so that it can be communicated in a generic text (i.e., XML) message.  This is called serialization or marshalling.

2) Receive data described in a generic text (i.e., XML) message and restore it to its proper data type.  This is called de-serialization, restoring, or un-marshalling.

The SOAP data model provides a set of abstract constructs that can be used to describe common data types and link relationships in data.

The SOAP data encoding provides for the syntactic representation of data described by the SOAP data model.

SOAP allows for user-defined encoding schemes.

SOAP Data Types

SOAP supports both simple types (strings, integers, etc.) and complex types (aggregates of multiple values such as arrays).  SOAP adopts the simple types supported by XML Schema and adds encoding elements for each type.  For example:
 <enc:int xmlns:enc="http://www.w3.org/2001/06/soap-encoding"
  id="int1">45</enc:int>

SOAP supports both arrays (i.e., components accessed by ordinal position) and structs (i.e., components accessed by name).
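
To make serialization and de-serialization of simple types concrete, here is a small Python sketch that writes values as elements in the soap-encoding namespace and reads them back.  The mapping between Python types and element names (int, double, boolean, string) is an assumption chosen for this example; it covers only a few simple types and ignores arrays, structs, and references.

```python
import xml.etree.ElementTree as ET

ENC_NS = "http://www.w3.org/2001/06/soap-encoding"

# Hypothetical mapping from Python types to encoding element names.
_ENCODERS = {
    bool: ("boolean", lambda v: "true" if v else "false"),
    int: ("int", str),
    float: ("double", repr),
    str: ("string", str),
}
_DECODERS = {
    "boolean": lambda text: text in ("true", "1"),
    "int": int,
    "double": float,
    "string": str,
}


def encode_simple(value):
    """Serialize (marshal) a simple Python value as an enc:* element."""
    name, to_text = _ENCODERS[type(value)]
    element = ET.Element("{%s}%s" % (ENC_NS, name))
    element.text = to_text(value)
    return element


def decode_simple(element):
    """De-serialize (un-marshal) an enc:* element into a Python value."""
    namespace, _, name = element.tag[1:].partition("}")
    if namespace != ENC_NS or name not in _DECODERS:
        raise ValueError("unsupported element: %s" % element.tag)
    # An empty element parses with text None, so treat that as "".
    return _DECODERS[name](element.text or "")


sample = ET.fromstring(
    '<enc:int xmlns:enc="http://www.w3.org/2001/06/soap-encoding"'
    ' id="int1">45</enc:int>'
)
print(decode_simple(sample))
print(ET.tostring(encode_simple(34.5), encoding="unicode"))
```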

There are various issues in encoding, including whether names are locally or universally scoped, whether data values are single-reference or multi-reference, and whether data values are independent (top-level) or embedded.  Most of the SOAP 1.2 Working Draft is taken up with describing the encoding rules and giving examples for various possibilities.

Here is one of the simpler examples.  It starts with a small XML Schema that declares a complex type:
    <xs:element name="Book"
            xmlns:xs='http://www.w3.org/2001/XMLSchema' >
      <xs:complexType>
        <xs:sequence>
          <xs:element name="author" type="xs:string" />
          <xs:element name="preface" type="xs:string" />
          <xs:element name="intro" type="xs:string" />
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Then, an example of struct type "Book":
    <e:Book xmlns:e="http://example.org/2001/06/books" >
       <author>Henry Ford</author>
       <preface>Prefatory text</preface>
       <intro>This is a book.</intro>
    </e:Book>
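
Because a struct's components are accessed by name, it maps naturally onto a dictionary.  The following Python sketch converts the Book instance to a dictionary and back; decode_struct and encode_struct are hypothetical helper names, and the sketch assumes every accessor holds a plain string, as the schema above declares.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.org/2001/06/books"

BOOK_XML = """\
<e:Book xmlns:e="http://example.org/2001/06/books">
  <author>Henry Ford</author>
  <preface>Prefatory text</preface>
  <intro>This is a book.</intro>
</e:Book>
"""


def decode_struct(element):
    """Map each accessor name to its text, keeping document order."""
    return {child.tag: child.text for child in element}


def encode_struct(qualified_name, fields):
    """Build a struct element with one unqualified accessor per field."""
    element = ET.Element(qualified_name)
    for name, value in fields.items():
        ET.SubElement(element, name).text = value
    return element


book = decode_struct(ET.fromstring(BOOK_XML))
rebuilt = encode_struct("{%s}Book" % BOOK_NS, book)
print(book["author"])
print([child.tag for child in rebuilt])
```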

The SOAP 1.2 Working Draft provides more involved examples.

SOAP RPC

One of the motivations for SOAP is to provide a simple, lightweight mechanism for remote procedure calls (RPCs).  RPCs can involve a one-way (i.e., call) or two-way (i.e., call and reply) transmission.  The call and reply are both carried in the SOAP Body element.

The following information is needed for RPCs:
   Target URI
   Procedure or method name
   Parameters to the procedure or method
   Procedure or method signature (optional)
   Header data (optional)

SOAP relies on the protocol binding (e.g., the binding to HTTP) to provide the mechanism for carrying the target URI.

Both the RPC call and reply are modeled as a struct (i.e., a single compound datatype).  For the call, the struct name and type are identical to the method name.  For the reply, the name of the struct is not significant, but by convention it is the method name with the string "Response" appended.

There is an accessor for each parameter (either in or in/out).  For the RPC call, the name and type correspond to the name and type of the parameters in the same order as the method signature.  For the response, the struct has the return value of the method as the first accessor (with a not-significant name) followed by other parameters.

A call fault is encoded using a SOAP Fault.  The protocol binding may add additional rules for fault expression.

The example GetLastTradePrice call and GetLastTradePriceResponse illustrate the use of SOAP RPC.
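
The RPC conventions described in this section can be sketched generically in Python.  The helper names build_call, build_reply, and read_reply are assumptions for this example, as is the default accessor name "return" for the return value (the text above notes that this name is not significant).  The target URI is left out because it is carried by the protocol binding.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2001/06/soap-envelope"
ENC_NS = "http://www.w3.org/2001/06/soap-encoding"
QUOTE_NS = "http://example.org/2001/06/quotes"


def _envelope_with_struct(namespace, struct_name, accessors):
    """Wrap one struct, with accessors in the given order, in Envelope/Body."""
    envelope = ET.Element("{%s}Envelope" % ENV_NS)
    body = ET.SubElement(envelope, "{%s}Body" % ENV_NS)
    struct = ET.SubElement(body, "{%s}%s" % (namespace, struct_name))
    struct.set("{%s}encodingStyle" % ENV_NS, ENC_NS)
    for name, value in accessors:
        ET.SubElement(struct, name).text = str(value)
    return envelope


def build_call(namespace, method, params):
    """The call struct is named after the method; params keep signature order."""
    return _envelope_with_struct(namespace, method, params)


def build_reply(namespace, method, return_value, out_params=(),
                return_name="return"):
    """The reply struct is conventionally named <method>Response;
    the return value is its first accessor."""
    accessors = [(return_name, return_value)] + list(out_params)
    return _envelope_with_struct(namespace, method + "Response", accessors)


def read_reply(envelope):
    """Return (return_value_text, remaining_accessor_texts)."""
    struct = envelope.find("{%s}Body" % ENV_NS)[0]
    values = [child.text for child in struct]
    return values[0], values[1:]


call = build_call(QUOTE_NS, "GetLastTradePrice", [("symbol", "DIS")])
reply = build_reply(QUOTE_NS, "GetLastTradePrice", 34.5, return_name="Price")
print(call[0][0].tag)
print(read_reply(reply))
```

Values are written as text only; a fuller implementation would apply the encoding rules to each parameter and would check the reply for a Fault before reading accessors.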