First, the following four files are the input of the process: Typhoon
information in JMX, more specifically "台風解析・予報情報（３日予報）"
(Typhoon Analysis and Forecast Information, 3-day forecast) and
"台風解析・予報情報（５日予報）" (the same, 5-day forecast).

>  Length     Date   Time    Name
> --------    ----   ----    ----
>     3282  09-11-12 03:21   VPTW40_RJTD_20120810063949_NJ019NNA.xml
>     3273  09-11-12 03:21   VPTW40_RJTD_20120818063808_NJ047NNA.xml
>    20804  09-11-12 03:21   VPTW50_RJTD_20110804133116_NJ006NNA.xml
>    49117  09-11-12 03:21   VPTW51_RJTD_20110804130859_NJ030NNA.xml

I'm converting these JMX files into CAP, and then into an Atom feed.  So the
first step is to understand the structure of the input JMX.

Of course the official JMX XML Schema (XSD, published at
http://xml.kishou.go.jp/tec_material.html) gives the generic JMX structure.
But it is a really big framework that covers all types of JMX messages: it
contains a bunch of XML elements that never appear in the Typhoon
information, and some elements declared optional or repeatable in the XSD
are in practice always present or never repeated.

So we need to narrow down ("restrict" in XSD terminology) the schema to
understand the actual structure.  That is the job of "電文毎の解説資料"
(per-message explanatory material),
http://xml.kishou.go.jp/jmaxml_20120615_Manual(pdf).zip as we explained
today.  The PDF we need, the one for "台風解析・予報情報（３日予報）", is
http://goo.gl/mLKG8 if you can access wobzip.org.  I'm afraid I couldn't
find documentation for "台風解析・予報情報（５日予報）", but the structure is
quite similar.

Anyway, the PDF is documentation for human readers, meant to give the big
picture.  JMA's general approach is to provide some sample data in addition
to the document, so that the format is clear and understandable without a
formal language.

But I'd really like to have a computer-validated, unambiguous structure as a
safer reference for programming, so I wrote a schema only for the Typhoon
information, restricted as much as possible:

>    16889  08-23-12 06:42   input.rnc
>    33897  08-23-12 06:42   input.rng

The suffixes .rnc and .rng indicate the Compact Syntax and the XML syntax of
RELAX NG, respectively.  The make target "make inputcheck" validates the
given example JMX files.  Comments in the .rnc might be useful as notes on
the conditions under which optional elements appear.

The next step is the core step, conversion from JMX to CAP.

>    34529  09-11-12 03:28   vptw2cap.xsl

Basically no XSLT parameters are needed in an operational run.  The only one
specified in the Makefile, $TESTCONV, is just for testing with limited
examples (at the moment I don't have an example of VPTW40 (three-day
forecast) that really covers 72 hours).

The following files are imported and used by the converter.  They compute an
MD5 hash from some content of the JMX, in order to generate the UUID used as
alert/identifier of the CAP.  JMA's project team considers an opaque
identifier desirable for this field.

>   127214  08-14-12 08:27   vaporoid-hash.xsl
>     3122  08-12-12 21:21   vaporoid-hash-char-encode-1.xml
>     6676  08-12-12 21:21   vaporoid-hash-uint4-bitwise-and.xml
>     6828  08-12-12 21:21   vaporoid-hash-uint4-bitwise-or.xml
>     6744  08-12-12 21:21   vaporoid-hash-uint4-bitwise-xor.xml
>     6602  08-12-12 21:21   vaporoid-hash-uint8-decode.xml
>    11358  08-12-12 21:20   LICENSE.vaporoid-hash
>       52  08-12-12 21:20   NOTICE.vaporoid-hash
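
To illustrate the idea of deriving the identifier from an MD5 hash, here is a
minimal sketch in Python rather than XSLT.  It is not what vaporoid-hash.xsl
or vptw2cap.xsl actually do: the function name md5_uuid and the seed string
are hypothetical, and which parts of the JMX are hashed is an assumption.

```python
import hashlib
import uuid


def md5_uuid(content: str) -> str:
    """Return a UUID string derived from the MD5 digest of `content`."""
    digest = hashlib.md5(content.encode("utf-8")).digest()  # 16 bytes
    # version=3 overwrites the version and variant bits, so the result is
    # a well-formed name-based (MD5) UUID in the RFC 4122 layout.
    return str(uuid.UUID(bytes=digest, version=3))


# Hypothetical seed: the real converter may hash different JMX content.
seed = "VPTW40|RJTD|2012-08-10T06:39:49Z"
identifier = md5_uuid(seed)
```

The same input always gives the same identifier, so re-running the
conversion on the same JMX would not produce a new alert/identifier, while
the value itself stays opaque to the reader.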

The converter vptw2cap.xsl reads a single JMX and writes a single CAP, which
contains multiple alert/info elements.  That's because the original JMX
contains an analysis (of the current position of the Typhoon) and forecasts
for multiple times, and each has a different effective time and geolocation
(latitude, longitude and gale radius).  This is a typical example of
mismatched granularity, which I suspect will occur many times in the wider
project of JMX-to-CAP conversion.
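
The one-to-many mapping can be sketched as follows, again in Python and only
as an illustration.  The function build_alert, the dictionary keys and the
sample positions are all hypothetical, and the output is deliberately
incomplete: a real cap:info needs more required elements (category, event,
urgency, severity, certainty) before it validates against CAP-v1.2.xsd.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"


def cap(tag: str) -> str:
    """Qualify a local name with the CAP 1.2 namespace."""
    return f"{{{CAP_NS}}}{tag}"


def build_alert(identifier, points):
    """Build one cap:alert holding one cap:info per analysis/forecast time."""
    alert = ET.Element(cap("alert"))
    ET.SubElement(alert, cap("identifier")).text = identifier
    for p in points:
        info = ET.SubElement(alert, cap("info"))
        ET.SubElement(info, cap("effective")).text = p["time"]
        area = ET.SubElement(info, cap("area"))
        ET.SubElement(area, cap("areaDesc")).text = p["label"]
        # CAP circle syntax: "lat,lon radius", radius in kilometres.
        circle = f'{p["lat"]},{p["lon"]} {p["radius_km"]}'
        ET.SubElement(area, cap("circle")).text = circle
    return alert


# Made-up positions, not taken from any of the example JMX files.
points = [
    {"time": "2012-08-10T06:00:00+09:00", "label": "analysis",
     "lat": 24.5, "lon": 128.3, "radius_km": 150},
    {"time": "2012-08-11T06:00:00+09:00", "label": "24h forecast",
     "lat": 27.0, "lon": 126.1, "radius_km": 200},
    {"time": "2012-08-12T06:00:00+09:00", "label": "48h forecast",
     "lat": 30.2, "lon": 125.0, "radius_km": 260},
]
alert = build_alert("example-identifier", points)
```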

Then the CAP files are validated against the official XML schema
(CAP-v1.2.xsd) and my private schema (output.rng), which is again made as
restricted as possible so that it serves as good documentation of the
structure of the CAP we are going to create.  We hope to rewrite it into a
document, to be called "JMA Profile to CAP", that gives guidelines for CAP
conversion of JMX.

>    10098  08-08-12 07:17   CAP-v1.2.xsd
>     5103  09-10-12 09:55   output.rnc
>     9668  09-10-12 09:55   output.rng

The following XSLT, "parameter.xsl", complements the output schema by
validating the values of parameter/value depending on valueName.  It's
basically a "poor man's Schematron", as it is painful to use libxml2's
implementation of Schematron, which doesn't include regular expressions.

>     5939  08-23-12 10:20   parameter.xsl
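
The kind of check meant here, a value pattern that depends on valueName, can
be sketched in Python as below.  This is not a port of parameter.xsl: the
valueName strings and patterns in RULES are invented for illustration, and
check_parameters is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"
NS = {"cap": CAP_NS}

# Hypothetical table: valueName -> pattern the whole value must match.
RULES = {
    "central_pressure_hPa": re.compile(r"\d{3,4}"),
    "max_wind_kt": re.compile(r"\d{1,3}"),
}


def check_parameters(root):
    """Return a list of error messages for all cap:parameter elements."""
    errors = []
    for param in root.iter(f"{{{CAP_NS}}}parameter"):
        name = param.findtext("cap:valueName", default="", namespaces=NS)
        value = param.findtext("cap:value", default="", namespaces=NS)
        rule = RULES.get(name)
        if rule is None:
            errors.append(f"unknown valueName: {name}")
        elif not rule.fullmatch(value):
            errors.append(f"bad value for {name}: {value!r}")
    return errors


sample = ET.fromstring(
    f'<alert xmlns="{CAP_NS}"><info>'
    "<parameter><valueName>max_wind_kt</valueName>"
    "<value>65</value></parameter>"
    "<parameter><valueName>central_pressure_hPa</valueName>"
    "<value>low</value></parameter>"
    "</info></alert>"
)
errors = check_parameters(sample)
```

A grammar-based schema such as output.rng can say that parameter has a
valueName and a value, but tying the allowed value to the name is the
co-occurrence constraint that needs this extra pass.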

Finally the CAP files are converted into an Atom feed by cap2atom.xsl.  It
takes the parameters $files and $url.

>     4648  08-22-12 08:58   cap2atom.xsl
>    42066  09-11-12 03:28   atom.xml

Currently the converter "splits" the input CAP so that the output contains
as many <entry> elements as there are <cap:info> elements.  Right now I
guess that is more useful for you, but I'm not quite sure about this choice.
It may change in a later version.
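
For reference, the splitting amounts to something like the following Python
sketch.  It only illustrates the one-entry-per-info idea and is not a
description of cap2atom.xsl: the function cap_to_entries, the way the entry
id is formed, and the meaning given to the url argument are assumptions (it
is not necessarily the same thing as the $url parameter).

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"
ATOM_NS = "http://www.w3.org/2005/Atom"


def cap_to_entries(alert, url):
    """Return one atom:entry element per cap:info found in `alert`."""
    entries = []
    infos = alert.findall(f"{{{CAP_NS}}}info")
    for i, info in enumerate(infos, start=1):
        entry = ET.Element(f"{{{ATOM_NS}}}entry")
        # Assumed id scheme: CAP document URL plus a per-info fragment.
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = f"{url}#info-{i}"
        title = info.findtext(f"{{{CAP_NS}}}headline", default="(no headline)")
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
        effective = info.findtext(f"{{{CAP_NS}}}effective")
        if effective is not None:
            ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = effective
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"href": url})
        entries.append(entry)
    return entries


sample_cap = ET.fromstring(
    f'<alert xmlns="{CAP_NS}">'
    "<info><headline>Analysis</headline>"
    "<effective>2012-08-10T06:00:00+09:00</effective></info>"
    "<info><headline>24h forecast</headline>"
    "<effective>2012-08-11T06:00:00+09:00</effective></info>"
    "</alert>"
)
entries = cap_to_entries(sample_cap, "http://example.org/cap/sample.xml")
```

The alternative would be a single <entry> per CAP file, with the individual
cap:info elements left for the reader of the CAP itself.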

