DAP4: DDX Grammar

From OPeNDAP Documentation

Revision as of 21:35, 22 February 2012

Version: 1.0

At the end of this document are instructions for accessing and testing a formal grammar for the DAP4 DDX, written in the Relax-NG schema language. I constructed it without reference to any other explicit or implicit grammar so that I could record my own proposal. I have since modified it based on the grammar implied by the page DAP4: Data Model, on comments from others, and on a comparison with the xsd grammar.

1 Differences with DAP4 xsd Grammar

I converted the xsd-based grammar

http://scm.opendap.org/trac/browser/trunk/xml/dap/dap4.xsd

to an equivalent Relax-NG grammar:

http://dl.dropbox.com/u/53929684/xsd.rng


One major difference I see is in dimension handling.

  1. I just used the name "dimension" rather than "shareddimension"; for me, all dimensions (except anonymous ones) are shared.
  2. The xsd separates out scalars from arrays. I always allowed the dimensions for a variable to be optional to handle the scalar case.
  3. I attempted to be as consistent as possible, so I allowed any type including sequences and structures to be dimensioned.
  4. The dimensions of a variable are currently specified in the rng grammar as an attribute named "dimensions" on the variable, e.g. dimensions="dr d1".
    Previously I used nested elements:
    <dimensions>
      <dimension name="dr"/>
      <dimension name="d1"/>
    </dimensions>

    But this seemed kind of verbose.
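
To make the trade-off concrete, here is a minimal Python sketch (standard library only) that reads the dimension names from either encoding and treats a variable with no dimensions as a scalar. The element name "variable" and the helper dimension_names are placeholders invented for this illustration; they are not taken from dap4.rng.

```python
import xml.etree.ElementTree as ET


def dimension_names(variable):
    """Return the dimension names of a variable element ([] for a scalar)."""
    # Current form: whitespace-separated names in a "dimensions" attribute.
    attr = variable.get("dimensions")
    if attr is not None:
        return attr.split()
    # Earlier form: a nested <dimensions> element holding <dimension name="..."/>.
    nested = variable.find("dimensions")
    if nested is not None:
        return [d.get("name") for d in nested.findall("dimension")]
    # Neither form present: the dimensions are optional, so this is a scalar.
    return []


# "variable" is a hypothetical element name used only for this example.
attribute_form = ET.fromstring('<variable name="v" dimensions="dr d1"/>')
nested_form = ET.fromstring(
    '<variable name="v"><dimensions>'
    '<dimension name="dr"/><dimension name="d1"/>'
    '</dimensions></variable>'
)
scalar = ET.fromstring('<variable name="s"/>')

print(dimension_names(attribute_form))
print(dimension_names(nested_form))
print(dimension_names(scalar))
```

Both encodings carry the same information; the attribute form is shorter to write, while the nested form leaves room for per-dimension markup should that ever be needed.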

Other differences:

  1. The Dataset element in the xsd has a couple of extra attributes. I added these.
  2. The xsd appears to allow attributes to themselves have attributes. This needs discussion.
  3. I forgot enumerations and opaque. I added them.
  4. The URL basetype is in the xsd. What is the justification for keeping it?
  5. It appears that the Dataset contains a top-level <group> declaration; I chose to treat the Dataset itself as the top-level group.
  6. Attribute declarations appear to have their own "namespace" attribute. I am not sure why this is needed.
  7. I do not understand the purpose of the "NewAttribute" attribute.
  8. The Grid issue, of course.

There are also some minor differences.

  1. Element names (e.g. <structure>) are capitalized in the xsd grammar.
  2. There is an issue of interleaving of definitions, or equivalently, what elements must occur in a fixed order.
  3. Where should attributes be legal? I think the rng grammar and the xsd grammar agree on this (both allow them almost everywhere), but it needs discussion.
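
The interleaving question in item 2 can be illustrated with a small Python sketch. It contrasts a fixed-order rule with a free-order rule over a list of child element names. The particular ordering ("dimension", then "structure", then "group") is an assumption made up for the example, and the check ignores cardinality, so it is a simplification of what a schema language actually enforces.

```python
def matches_fixed_order(children, order):
    """True if every child is known and children follow the sequence in `order`."""
    rank = {name: i for i, name in enumerate(order)}
    if any(name not in rank for name in children):
        return False
    positions = [rank[name] for name in children]
    # Fixed order means the positions never decrease.
    return positions == sorted(positions)


def matches_interleave(children, allowed):
    """True if every child is one of the allowed names, in any order."""
    return all(name in allowed for name in children)


# Hypothetical ordering, chosen only for illustration.
ORDER = ["dimension", "structure", "group"]

in_order = ["dimension", "dimension", "structure", "group"]
mixed = ["structure", "dimension", "group"]

print(matches_fixed_order(in_order, ORDER), matches_interleave(in_order, set(ORDER)))
print(matches_fixed_order(mixed, ORDER), matches_interleave(mixed, set(ORDER)))
```

A document that mixes declarations freely passes only the interleaved rule, which is the practical difference the two grammars would need to agree on.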

Other differences:

  1. I temporarily suppressed OtherXML because it did not translate correctly.
  2. I dropped Blobtype; I fail to see the need for this.

1.1 Testing the Relax-NG Grammar

You will need to copy three files:

  1. dap4.rng - this is the grammar file; it uses the Relax-NG schema language (http://relaxng.org/).
    This can be obtained from http://dl.dropbox.com/u/53929684/dap4.rng
  2. test.xml - this is a test file that I am growing to cover the whole grammar.
    This can be obtained from http://dl.dropbox.com/u/53929684/test.xml
  3. jing.jar - Jing is a validator that takes the grammar and a test file and checks that the test file conforms to the grammar.
    This can be obtained from http://dl.dropbox.com/u/53929684/jing.jar

To use it, run the command:

java -jar jing.jar dap4.rng test.xml

No output is produced if the validation succeeds; otherwise, error messages are produced.
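
If the check needs to be scripted, the same command can be wrapped in a few lines of Python. This is only a sketch: the helper names jing_command and validate are invented here, and success is judged by the behaviour described above (no output means the file is valid). It assumes java is on the PATH and that the three files are in the current directory.

```python
import subprocess


def jing_command(grammar, document, jar="jing.jar"):
    """Build the argument list for: java -jar jing.jar <grammar> <document>."""
    return ["java", "-jar", jar, grammar, document]


def validate(grammar, document, jar="jing.jar"):
    """Run Jing and return (ok, messages); ok is True when Jing prints nothing."""
    proc = subprocess.run(
        jing_command(grammar, document, jar),
        capture_output=True,  # requires Python 3.7 or later
        text=True,
    )
    # Collect both streams so messages are seen wherever Jing writes them.
    messages = (proc.stdout + proc.stderr).strip()
    return messages == "", messages


# Example use (not executed here, since it needs java and the downloaded files):
#   ok, messages = validate("dap4.rng", "test.xml")
#   print("valid" if ok else messages)
```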

-Dennis Heimbigner