DAP4: DDX Grammar

From OPeNDAP Documentation

Revision as of 21:35, 22 February 2012

Version: 1.0

At the end of this document are instructions for accessing and testing a formal grammar for the DAP4 DDX, written in the Relax-NG schema language. I constructed it without reference to any other explicit or implicit grammars so that I could record my proposal. I have since modified it based on the implied grammar in the page DAP4: Data Model, on comments from others, and on a comparison with the xsd grammar.

Differences with DAP4 xsd Grammar

I converted the xsd-based grammar

http://scm.opendap.org/trac/browser/trunk/xml/dap/dap4.xsd

to an equivalent relax-ng grammar.

http://dl.dropbox.com/u/53929684/xsd.rng


One major difference I see is in dimension handling.

  1. I just used the name "dimension" rather than "shareddimension"; for me, all dimensions (except anonymous ones) are shared.
  2. The xsd separates out scalars from arrays. I always allowed the dimensions for a variable to be optional to handle the scalar case.
  3. I attempted to be as consistent as possible, so I allowed any type including sequences and structures to be dimensioned.
  4. The dimensions of a variable are currently specified in the rng grammar as an attribute named "dimensions" on each variable, whose value is a list of dimension names: e.g. dimensions="dr d1".
    Previously I used this:
    <dimensions>
      <dimension name="dr"/>
      <dimension name="d1"/>
    </dimensions>

But this seemed kind of verbose.
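
To make the trade-off between the two encodings concrete, here is a minimal Python sketch that reads the dimension names of a variable from either form. The element name "Int32" and the variable name "v" are assumptions made for illustration and are not taken from dap4.rng; only the "dimensions" attribute and the nested <dimensions>/<dimension> elements come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable declarations: "Int32" and "v" are illustrative
# names, not copied from dap4.rng.
ATTRIBUTE_FORM = '<Int32 name="v" dimensions="dr d1"/>'
NESTED_FORM = """
<Int32 name="v">
  <dimensions>
    <dimension name="dr"/>
    <dimension name="d1"/>
  </dimensions>
</Int32>
"""


def dimension_names(variable_xml):
    """Return the dimension names of a variable in either encoding."""
    var = ET.fromstring(variable_xml.strip())
    listed = var.get("dimensions")
    if listed is not None:
        # Attribute form: a whitespace-separated list of names.
        return listed.split()
    # Nested form: one <dimension> child per name.
    # A scalar variable has neither form and yields an empty list.
    return [d.get("name") for d in var.findall("./dimensions/dimension")]


print(dimension_names(ATTRIBUTE_FORM))
print(dimension_names(NESTED_FORM))
```

Both forms carry the same ordered list of names; the attribute form is shorter, while the nested form leaves room to attach further information to each dimension reference.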

Other differences:

  1. The Dataset element in the xsd has a couple of extra attributes. I added these.
  2. The xsd appears to allow attributes to have attributes themselves. This needs discussion.
  3. I forgot enumerations and opaque. I added them.
  4. The URL basetype is in the xsd. What is the justification for keeping it?
  5. It appears that the Dataset contains a top-level <group> declaration; I chose to treat the Dataset itself as the top-level group.
  6. Attribute declarations appear to have their own "namespace" attribute. Not sure why this is needed.
  7. I do not understand the purpose of the "NewAttribute" attribute.
  8. The Grid issue, of course.

There are also some minor differences.

  1. Element names (e.g. <structure>) are capitalized in the xsd grammar.
  2. There is an issue of interleaving of definitions, or equivalently, what elements must occur in a fixed order.
  3. Where should attributes be legal? I think the rng grammar and the xsd grammar agree on this, allowing them almost everywhere, but it needs discussion.

Other differences:

  1. I temporarily suppressed OtherXML because it did not translate correctly.
  2. I dropped Blobtype; I fail to see the need for this.

Testing the Relax-NG Grammar

You will need to copy three files:

  1. dap4.rng - this is the grammar file; it uses the Relax-NG schema language (http://relaxng.org/).
    This can be obtained from http://dl.dropbox.com/u/53929684/dap4.rng
  2. test.xml - this is a test file that I am extending to cover the whole grammar.
    This can be obtained from http://dl.dropbox.com/u/53929684/test.xml
  3. jing.jar - Jing is a validator that takes the grammar and a test file and checks that the test file conforms to the grammar.
    This can be obtained from http://dl.dropbox.com/u/53929684/jing.jar.

To use it, run the command:

java -jar jing.jar dap4.rng test.xml

No output is produced if the validation succeeds; otherwise, error messages are produced.
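
For scripted checks, the same command can be wrapped in a few lines of Python. This is only a sketch: the function names jing_command and validate are hypothetical, it assumes java is on the PATH and that the three files are in the current directory, and it treats "no output" as success, as described above.

```python
import subprocess


def jing_command(grammar="dap4.rng", document="test.xml", jar="jing.jar"):
    """Build the command line shown above as an argument list."""
    return ["java", "-jar", jar, grammar, document]


def validate(grammar="dap4.rng", document="test.xml", jar="jing.jar"):
    """Run Jing and return (ok, messages).

    ok is True when Jing printed nothing, which the text above
    describes as a successful validation.
    """
    result = subprocess.run(
        jing_command(grammar, document, jar),
        capture_output=True,
        text=True,
    )
    messages = (result.stdout + result.stderr).strip()
    return messages == "", messages
```

Calling validate() with no arguments runs the command exactly as given above; on failure, the second element of the returned pair holds whatever Jing printed.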

-Dennis Heimbigner