DAP4: Possible Notation for Server Commands: Difference between revisions

From OPeNDAP Documentation
(Created page with "Looking to the future, it is clear that eventually our query language, or more generically our server commands must encompass three classes of comput...")
 
No edit summary
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:Development|Development]][[Category:DAP4|DAP4]]
[[OPULS_Development| << Back to OPULS Development]]
==Background==
Looking to the future, it is clear that eventually
our query language, or more generically our [[DAP4: URL Annotations | server commands ]] must encompass three classes of computations.
Line 5: Line 9:
# Server-side processing.


I want to propose a notation for everything in the URL after the "?". I think this notation has ability to represent a wide variety of features without, I hope, being too generic.
I want to propose a notation for everything in the URL after the "?". I think this notation has the ability to represent a wide variety of features without, I hope, being too generic.
 
== Proposal ==
The notation is basically nested functions combined with single assignment variables. A semantically nonsensical, but grammatical example would look something like this.
?svc("cmd");$x=f("string17",g(h(12))),f2($x,[0:3:10])
Everything past the "?" is in the form of a semicolon-separated list of expression lists. An expression list is a comma-separated list of nested function invocations, possibly assigned to a variable. Anything that begins with a dollar sign is considered a local, temporary variable. Anything that does not look like a function call (i.e. a name followed immediately by a left paren) is assumed to be what I will call a non-quoted string constant; standard quoted string constants are also allowed. Each function has an arbitrary number of argument expressions separated by commas.
 
BTW, the term "single assignment" means that a variable may only be assigned to once, but may be referenced as many times as desired after that.
 
One important issue involves providing namespaces for function names. That is, there will be standard pre-defined functions, server-specific functions, and even dynamically defined functions. In order to define a namespace, it is necessary to provide some kind of marker in function names. I have chosen the Java fully qualified name model in which the marker is the dot character.
 
The idea is that any function name has a fully qualified name specified using dot separators. Names that have no dots (modulo import below) are assumed to be in the pre-defined standard namespace.
 
The management of that namespace tree must be determined, with some mechanism for allowing others to assign functions in the namespace. Also, some form of import mechanism à la Java would be desirable as a separate server command.


The notation is basically nested functions combined with single assignment variables. A semantically non-sensical, but grammatical example would look something like this.
?_x=f(17,g(h(12))),f2(_x,[0:3:10])
Everthing to past the "?" is in the form of a comma separated list of nested function invocations. Anything that begins with an underscore is considered a local, temporary, variable, anything that does not look like a function call (i.e. name followed immediately by left paren) is assume to be string constant. Each function has an arbitrary number of argument expressions separated by commas.


There would be several semantic rules.
== Syntactic and Lexical Structure ==
# A variable may only be assigned to once (single assignment), but may be referenced as many times as desired after that.
I have defined a preliminary [http://dl.dropbox.com/u/53929684/query.y syntax document] and [http://dl.dropbox.com/u/53929684/query.lex lexical document] for the ideas presented here.
# All functions have a defined "return type", which looks like a legal DDX minus certain things like groups, enumeration declarations, and dimension declarations; in addition, a function may be defined to have a "void" return type, which means it is executed for its side-effects on the server.
 
# Any expression that is not assigned to a variable and does not have a void return type will have its return value returned to the caller as part of a DATADDX.
== Discussion ==
The purpose of providing multiple expression lists separated by semicolons is to support a visible notation for specifying server commands separately from constraint evaluation.


== Notes ==
My hypothesis is that this notation should also be able to handle most kinds of server side processing by defining and composing functions.


The standard projection+selection constraints of DAP2 can be represented using a special query() function whose argument is the standard DAP2 constraint [or alternatively, one could define a collection of nested functions to do the same thing], or alternatively, we could split the query part into two pieces separated by a semicolon. The first piece would be a constraint expression and the second piece (after the semicolon) would be in the nest function call form defined above.
The standard projection+selection constraints of DAP2 can be represented using a special constraint() function whose argument is the standard DAP2 constraint. Alternatively, one could define a collection of nested functions to do the same thing.
 
An important issue involves the construction of a DDX from a constraint. I have begun this discussion [[DAP4: Constructing a DDX from a Query | here ]]
 
I hypothesize that Ferret notations could be represented in my proposed function notation without having to clutter up the URL format. Consider this Ferret expression.
: <nowiki>http://.../thredds/dodsC/hfrnet/agg/6km_expr_{}{let deq1ubar=u[d=1,l=1:24@ave]}</nowiki>
 
A possible equivalent (assuming I understand the Ferret expression) might look like this.
: <nowiki>?avg(u[1,1:24])</nowiki>


An important aspect has to do with the construction of what may be referred to as a DATADDX. It defines the structure of a DDX that is the composition of the return types of the invoked functions that will return a (possibly structured) value. I need to work this out. BUT, in any case, the resulting DATADDX may have only have a loose relation to any DDX representing the raw dataset.  This is because server-side computations will not have been represented in the original DDX, but only in the DATADDX.


''-Dennis Heimbigner''

Latest revision as of 19:48, 13 April 2012


Background

Looking to the future, it is clear that eventually our query language, or more generically our server commands, must encompass three classes of computations.

  1. Queries in the DAP2 sense,
  2. Commands to control the processing of requests on the server (e.g. caching),
  3. Server-side processing.

I want to propose a notation for everything in the URL after the "?". I think this notation has the ability to represent a wide variety of features without, I hope, being too generic.

Proposal

The notation is basically nested functions combined with single assignment variables. A semantically nonsensical, but grammatical example would look something like this.

?svc("cmd");$x=f("string17",g(h(12))),f2($x,[0:3:10])

Everything past the "?" is in the form of a semicolon-separated list of expression lists. An expression list is a comma-separated list of nested function invocations, possibly assigned to a variable. Anything that begins with a dollar sign is considered a local, temporary variable. Anything that does not look like a function call (i.e. a name followed immediately by a left paren) is assumed to be what I will call a non-quoted string constant; standard quoted string constants are also allowed. Each function has an arbitrary number of argument expressions separated by commas.
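
To make the structure described above concrete, here is a minimal recursive-descent sketch in Python. It is not the preliminary grammar referenced on this page; the class names (Const, Var, Call, Statement, Parser) and the function parse_query are hypothetical. It also assumes that a non-quoted constant may contain commas inside square brackets (as in u[1,1:24]), that quoted strings have no escape sequences, and that a bare expression is accepted wherever a function invocation is expected.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Const:
    text: str
    quoted: bool          # True for "..." constants, False for non-quoted ones


@dataclass
class Var:
    name: str             # variable name without the leading "$"


@dataclass
class Call:
    name: str
    args: List["Expr"]


Expr = Union[Const, Var, Call]


@dataclass
class Statement:
    target: Optional[str]  # variable being assigned, or None
    expr: Expr


def _is_name_char(ch: str) -> bool:
    return ch.isalnum() or ch == "_"


class Parser:
    def __init__(self, text: str) -> None:
        self.s = text[1:] if text.startswith("?") else text
        self.i = 0

    def peek(self) -> str:
        return self.s[self.i] if self.i < len(self.s) else ""

    def expect(self, ch: str) -> None:
        if self.peek() != ch:
            raise ValueError(f"expected {ch!r} at offset {self.i}")
        self.i += 1

    def take_while(self, pred) -> str:
        start = self.i
        while self.i < len(self.s) and pred(self.s[self.i]):
            self.i += 1
        return self.s[start:self.i]

    def separated(self, sep: str, item) -> list:
        # Parse item (sep item)*
        items = [item()]
        while self.peek() == sep:
            self.i += 1
            items.append(item())
        return items

    def parse(self) -> List[List[Statement]]:
        # Semicolon-separated expression lists, each a comma-separated list.
        lists = self.separated(";", lambda: self.separated(",", self.statement))
        if self.i != len(self.s):
            raise ValueError(f"unexpected {self.peek()!r} at offset {self.i}")
        return lists

    def statement(self) -> Statement:
        if self.peek() == "$":
            save = self.i
            self.i += 1
            name = self.take_while(_is_name_char)
            if self.peek() == "=":
                self.i += 1
                return Statement(name, self.expr())
            self.i = save  # a plain variable reference, not an assignment
        return Statement(None, self.expr())

    def expr(self) -> Expr:
        if self.peek() == '"':
            self.i += 1
            body = self.take_while(lambda ch: ch != '"')
            self.expect('"')
            return Const(body, quoted=True)
        if self.peek() == "$":
            self.i += 1
            return Var(self.take_while(_is_name_char))
        # Read a bare word; delimiters inside [...] do not end it.
        start, depth = self.i, 0
        while self.i < len(self.s):
            ch = self.s[self.i]
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
            elif depth == 0 and ch in "(),;":
                break
            self.i += 1
        word = self.s[start:self.i]
        if not word:
            raise ValueError(f"expected an expression at offset {self.i}")
        if self.peek() != "(":
            return Const(word, quoted=False)
        self.i += 1  # name immediately followed by "(" is a function call
        args = [] if self.peek() == ")" else self.separated(",", self.expr)
        self.expect(")")
        return Call(word, args)


def parse_query(text: str) -> List[List[Statement]]:
    return Parser(text).parse()
```

Applied to the example above, parse_query yields two expression lists: the first holds the svc call, and the second holds the assignment to $x followed by the f2 call, whose second argument is kept as the non-quoted constant [0:3:10].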

BTW, the term "single assignment" means that a variable may only be assigned to once, but may be referenced as many times as desired after that.
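
The single-assignment rule can be checked mechanically. The following Python sketch scans a query string with regular expressions rather than a full parser; the function name check_single_assignment is hypothetical, and it assumes that all variables in a query share one scope across the semicolon-separated lists, which the proposal does not specify.

```python
import re

# Quoted string constants are blanked out first so that a "$" inside
# a string is not mistaken for a variable.
_QUOTED = re.compile(r'"[^"]*"')
# A variable is "$name"; a directly following "=" makes it an assignment.
_VARIABLE = re.compile(r'\$(\w+)(\s*=)?')


def check_single_assignment(query: str) -> list:
    """Return a list of rule violations (empty if the query is fine)."""
    text = _QUOTED.sub('""', query)
    assigned = set()
    problems = []
    for match in _VARIABLE.finditer(text):
        name = match.group(1)
        is_assignment = match.group(2) is not None
        if is_assignment:
            if name in assigned:
                problems.append(f"${name} assigned more than once")
            assigned.add(name)
        elif name not in assigned:
            problems.append(f"${name} referenced before assignment")
    return problems
```

For the example query above the function returns an empty list, while a query such as ?$x=f(1),$x=g(2) is reported as assigning $x more than once.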

One important issue involves providing namespaces for function names. That is, there will be standard pre-defined functions, server-specific functions, and even dynamically defined functions. In order to define a namespace, it is necessary to provide some kind of marker in function names. I have chosen the Java fully qualified name model in which the marker is the dot character.

The idea is that any function name has a fully qualified name specified using dot separators. Names that have no dots (modulo import below) are assumed to be in the pre-defined standard namespace.

The management of that namespace tree must be determined, with some mechanism for allowing others to assign functions in the namespace. Also, some form of import mechanism à la Java would be desirable as a separate server command.
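
One possible reading of the name-resolution rule is sketched below in Python. Everything concrete here is an assumption made for illustration: the standard namespace is given the placeholder name "std", imports are modelled as a mapping from short names to fully qualified names, and the function names used in the usage note are invented.

```python
from typing import Dict, Optional

# Placeholder name for the pre-defined standard namespace (an assumption;
# the proposal does not name it).
STANDARD_NAMESPACE = "std"


def resolve_function_name(name: str,
                          imports: Optional[Dict[str, str]] = None) -> str:
    """Map a function name as written in a query to a fully qualified name."""
    if "." in name:
        # Already fully qualified, e.g. a server-specific function.
        return name
    if imports and name in imports:
        # A dotless name made visible by an earlier import command.
        return imports[name]
    # Dotless and not imported: assumed to be a standard function.
    return f"{STANDARD_NAMESPACE}.{name}"
```

Under these assumptions, resolve_function_name("avg") gives "std.avg", a dotted name such as "org.example.regrid" is returned unchanged, and an import mapping {"regrid": "org.example.regrid"} lets the short name regrid resolve to the server-specific function.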


Syntactic and Lexical Structure

I have defined a preliminary syntax document (http://dl.dropbox.com/u/53929684/query.y) and lexical document (http://dl.dropbox.com/u/53929684/query.lex) for the ideas presented here.

Discussion

The purpose of providing multiple expression lists separated by semicolons is to support a visible notation for specifying server commands separately from constraint evaluation.

My hypothesis is that this notation should also be able to handle most kinds of server side processing by defining and composing functions.

The standard projection+selection constraints of DAP2 can be represented using a special constraint() function whose argument is the standard DAP2 constraint. Alternatively, one could define a collection of nested functions to do the same thing.
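
As a small illustration of the first option, the Python sketch below wraps an existing DAP2 constraint in a constraint() call and optionally places server commands in front of it as separate expression lists. The helper name build_query is hypothetical, the backslash escaping of embedded quotes is an assumption (the proposal does not define string escapes), and percent-encoding of the final URL is left out.

```python
from typing import Sequence


def build_query(dap2_constraint: str,
                server_commands: Sequence[str] = ()) -> str:
    """Build the part of the URL from "?" onward in the proposed notation."""
    # Assumed escaping convention for quotes inside the quoted constant.
    escaped = dap2_constraint.replace("\\", "\\\\").replace('"', '\\"')
    # Server commands come first, each as its own expression list.
    parts = list(server_commands) + [f'constraint("{escaped}")']
    return "?" + ";".join(parts)
```

For instance, build_query("u[0:3:10],v&time>100") produces ?constraint("u[0:3:10],v&time>100"), and passing server_commands=['svc("cmd")'] puts that command before the constraint, separated by a semicolon.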

An important issue involves the construction of a DDX from a constraint. I have begun this discussion on the page "DAP4: Constructing a DDX from a Query".

I hypothesize that Ferret notations could be represented in my proposed function notation without having to clutter up the URL format. Consider this Ferret expression.

http://.../thredds/dodsC/hfrnet/agg/6km_expr_{}{let deq1ubar=u[d=1,l=1:24@ave]}

A possible equivalent (assuming I understand the Ferret expression) might look like this.

?avg(u[1,1:24])


-Dennis Heimbigner