Authors:  Denis Vakatov, Eugene Vasilchenko

  ID2 Protocol:

			 Request
			 Reply
			 Transmission of Split data
			 TSE Sources
			 Authentication
			 PARAM packages
			 List of Params


-----------------------------------------------------------------------------

---------------------- Request

Requests (unlike replies) must be assembled into packets (ID2-Request-Packet).
This makes it convenient to send several requests at once, without first
having to read the replies to requests sent earlier in the same packet.
(Otherwise, if the client keeps sending requests without reading the pending
replies, the client-server communication can deadlock.)
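
The packaging described above can be sketched as follows. This is only an
illustration, not the actual ID2 data model: the classes Request,
RequestPacket and PacketBuilder, the string request bodies, and the choice to
number requests sequentially per connection are all assumptions made for the
example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Request:
    # Hypothetical stand-in for a single ID2 request.
    serial_number: int
    body: str


@dataclass
class RequestPacket:
    # Hypothetical stand-in for ID2-Request-Packet: an ordered batch of requests.
    requests: list = field(default_factory=list)


class PacketBuilder:
    """Assigns serial numbers and groups requests into one packet per send."""

    def __init__(self):
        # Assumption: serial numbers simply increase over the connection.
        self._serial = count(1)

    def build(self, bodies):
        return RequestPacket([Request(next(self._serial), b) for b in bodies])


builder = PacketBuilder()
first = builder.build(["request-A", "request-B"])   # sent in one piece
second = builder.build(["request-C"])
```

The client writes a whole packet before it starts reading, so it never
has to interleave reads and writes within a batch.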


---------------------- Reply

Every request may result in several replies. In this case
all but the last reply will have the end-of-reply member set to FALSE,
and the last one will have it set to TRUE.

Replies can be reordered by the server, and even interleaved with replies
to another request, if the client and server have negotiated the 'reorder'
capability.

Some requests have a nested request, which the server executes
with arguments taken from the parent reply, provided the parent request
was successful.

All these nested replies will have the same serial-number as the parent
replies, and only the last reply among the parent and nested replies will
be marked as end-of-reply.

If the connection does not support streaming (HTTP, for example), all
replies to a request have to be sent in one piece. In this case they will
be streamed inside this piece of data.
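
A client-side sketch of the rules above: replies are grouped by
serial-number, and a request counts as answered once a reply with
end-of-reply set to TRUE arrives. The Reply class, its payload field and the
function collect_replies are hypothetical names used only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Reply:
    # Hypothetical stand-in for an ID2 reply; only the two members
    # discussed above are modelled, plus an opaque payload.
    serial_number: int
    end_of_reply: bool
    payload: str = ""


def collect_replies(stream):
    """Yield (serial_number, replies) each time a request is fully answered."""
    pending = defaultdict(list)
    for reply in stream:
        pending[reply.serial_number].append(reply)
        if reply.end_of_reply:
            yield reply.serial_number, pending.pop(reply.serial_number)


# Replies to requests 1 and 2 arrive interleaved ('reorder' was negotiated).
stream = [
    Reply(1, False, "parent reply"),
    Reply(2, False, "parent reply"),
    Reply(1, True, "nested reply"),   # closes request 1
    Reply(2, True, "nested reply"),   # closes request 2
]
completed = list(collect_replies(stream))
```

Because parent and nested replies share one serial-number, the same loop
handles both without knowing which requests had nested parts.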


---------------------- Transmission of Split data

When a TSE is split, the initial reply will contain at least two blobs:
the so-called 'skeleton' and 'split-info'.
The skeleton of a TSE is the original TSE with some or all Seq-descr,
Seq-annot, Seq-data and assembly objects cut out.
This cut-out data is distributed evenly among blobs called 'chunks'.
The split-info blob describes how the data is distributed among the
chunks. This allows the client to determine on its side which chunks it
needs to do its work.

Sometimes a split TSE may be repackaged under the same Tse-id.
In this case the skeleton blob is left untouched; only the chunks and the
split-info blob change.
The new split data set gets a new, unique set of chunk ids, and the chunks
from the old split version remain on the server for a reasonable time.
Thus, clients that have already loaded the old split-info keep using the old,
compatible chunks, while new clients get the new split-info and
correspondingly use the new chunks.

When resolving a gi to a Tse-id, the server also sets the split-version
field, which lets caching clients know when to reload a new version of a
split TSE. It is convenient to use the last chunk id in the chunk set as the
split-version, but the protocol does not require this.
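
A caching client could use split-version roughly as sketched below. The
CachedSplit class, the cache layout and the function needs_reload are
hypothetical, and split-version is assumed here to be an integer that is
only compared for equality.

```python
from dataclasses import dataclass, field


@dataclass
class CachedSplit:
    # Hypothetical client-side cache entry for one split TSE.
    split_version: int
    split_info: str
    chunks: dict = field(default_factory=dict)   # chunk id -> chunk data


def needs_reload(cached, resolved_split_version):
    """True if the cache is empty or was filled from another split version."""
    return cached is None or cached.split_version != resolved_split_version


# Cache keyed by Tse-id (a plain string here, for illustration only).
cache = {"tse-1": CachedSplit(split_version=12, split_info="split-info v12")}

same = needs_reload(cache.get("tse-1"), 12)      # version unchanged
changed = needs_reload(cache.get("tse-1"), 15)   # TSE was repackaged
missing = needs_reload(cache.get("tse-2"), 3)    # never loaded
```

On a reload the client would discard the cached split-info and chunks
together, since chunk ids from different split versions do not mix.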


---------------------- TSE Sources

There may be several current TSEs containing information about
the same gi.
For example, in the current GenBank state there is a 'native' TSE with
the sequence and possibly a TSE with SNP data.
So every TSE is considered to have a 'source name', which is a string.
The sequence itself (if any) is assumed to be located in the source
with the empty name.


---------------------- Authentication

Client authentication can be implemented using the optional 'params' field.


---------------------- PARAM packages

1) At least in the case of a stateless connection (such as HTTP), the
   PARAMs must be sent with each request. For this reason, and also
   to help avoid accidental errors (typos, omissions, extras)
   in the PARAMs, and perhaps even to save some time on parsing them on
   the server side, there are so-called PACKAGEs of PARAMs,
   which the client is encouraged to use.

2) The PACKAGE is just a pre-defined list of PARAMs.

3) It is represented as a PARAM of type `package' with some unique
   (and preferably meaningful) name.

4) PACKAGEs are stored on the server side.

4.1) The list of (names of) available PACKAGEs can be retrieved from
     the server using request `get-params' with the `params' request field
     unset.
4.2) The contents, i.e. the PARAMs (name/value/type) constituting
     the PACKAGE(s), can be obtained using request `get-params' with
     the `params' request field containing the list of names (with
     type `package' of course, and no value) of the PACKAGE(s) whose
     contents you want to retrieve.
4.3) Or, you can get all available PACKAGEs along with their contents,
     without specifying their names, using request `get-params' with
     the `params' field set but empty.

5) A PACKAGE sent to the server is processed exactly as if it were
   just a list of the PARAMs which constitute it
   (except for case [6] below).

6) If the server does not recognize the name of a PACKAGE sent to it by
   the client, an error will be reported, and the request will not be
   acted upon.

7) One can send (for the same request) more than one PACKAGE and/or
   PARAMs. PACKAGE(s) here will again be treated just as if they were
   expanded to the list of PARAMs which constitute them.

8) The order matters -- if several PARAMs have the same name,
   the server will use the one furthest down the PARAMs list.
   All other PARAMs with that name will be ignored, and warnings
   reporting the conflict may be posted in the reply.
8.1) If two PARAMs have the same name but different types, then an
     error will be reported, and the request will not be acted upon.
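
Rules 5 through 8.1 can be sketched as a small resolution routine. This is
an illustration under stated assumptions, not the server's implementation:
the Param class, the PACKAGES table, the package name "default-client" and
the PARAM type name "string" are all hypothetical; only the type `package'
comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Param:
    name: str
    type: str                    # "package", or an illustrative type such as "string"
    value: Optional[str] = None


class ParamError(Exception):
    """Raised when the request must not be acted upon."""


# Hypothetical server-side PACKAGE table (rule 4).
PACKAGES = {
    "default-client": [
        Param("Compression", "string", "gzip,none"),
        Param("TSE-Split", "string", "allow"),
    ],
}


def resolve_params(params, packages=PACKAGES):
    """Return (effective PARAMs by name, list of warnings)."""
    # Rules 5-7: expand each PACKAGE in place, preserving list order.
    expanded = []
    for p in params:
        if p.type == "package":
            if p.name not in packages:            # rule 6
                raise ParamError(f"unknown PACKAGE: {p.name}")
            expanded.extend(packages[p.name])
        else:
            expanded.append(p)

    # Rule 8: the last PARAM with a given name wins; rule 8.1: type clash is an error.
    effective, warnings = {}, []
    for p in expanded:
        prev = effective.get(p.name)
        if prev is not None:
            if prev.type != p.type:
                raise ParamError(f"PARAM {p.name} has conflicting types")
            warnings.append(f"PARAM {p.name} overridden")
        effective[p.name] = p
    return effective, warnings


effective, warnings = resolve_params([
    Param("default-client", "package"),
    Param("Compression", "string", "none"),   # overrides the PACKAGE's value
])
```

Expanding PACKAGEs before resolving duplicates is what makes a PACKAGE
behave exactly like the list of PARAMs it stands for.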


---------------------- List of Params

List of client params:

1. "ID2-version" = "9", "100", etc.
2. "Compression" = "gzip,none" (the first one is preferred)
3. "TSE-Split" = "allow"
4. "Data-types" = "Seq-entry,Seq-annot"

Server params:

1. "ID2-version" = "9", "100", etc.
2. "Compression" = "gzip,none" (the first one is preferred)
3. "TSE-Split" = "allow"
4. "Data-sources" = "SNP"
5.
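
One possible way to combine the two "Compression" values is sketched below.
The text above only says that the first listed method is preferred; letting
the client's order decide, and the function name choose_compression, are
assumptions made for this example.

```python
def choose_compression(client_value, server_value):
    """Pick the first method in the client's list that the server also lists.

    Both arguments are comma-separated strings such as "gzip,none".
    Returns None when the two lists have no method in common.
    """
    client = [m.strip() for m in client_value.split(",") if m.strip()]
    server = {m.strip() for m in server_value.split(",") if m.strip()}
    for method in client:        # client order expresses preference
        if method in server:
            return method
    return None


both = choose_compression("gzip,none", "gzip,none")
fallback = choose_compression("gzip,none", "none")
nothing = choose_compression("gzip", "none")
```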

-----------------------------------------------------------------------------
$Id$