Issue 397: Dimension Intervals

Starting Date: 
2018-11-14
Working Group: 
3
Status: 
Open
Background: 

Posted by Martin on 7/11/2018

 

Continuing issue 363,

I propose the following:

"Whereas the CRM regards that intervals of primitive values are primitive values by themselves, there is currently no corresponding practice in RDF. Therefore, in analogy to the properties of E52 Time-Span, we define in CRM RDFS two more subproperties of P90 has value: “P90a_has_lower_value_limit” and “P90b_has_upper_value_limit”. The precise guidelines for using these properties are to be given."

Sensor arrays, increasingly in use, pose the issue of a single measurement resulting in an array of numbers which together form one quantitative statement about the observed. We can describe such structures easily as one complex type of unit (and define an IRI for it), and then regard the value as a matrix of numbers, in which each position obeys the subunits defined in the complex unit type.

Even if we regard complex matrices of numbers as one value for an instance of E54 Dimension, such as an RGB image, we can argue that minimal and maximal values exist as two separate matrices of the same structure.
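
As a rough illustration of this point, assuming a measurement value is held as a nested list of numbers: the lower and upper limits are simply two matrices of the same shape, and an observed value lies in the interval when every position lies between the corresponding limits. The function names and sample pixel values below are invented for the example and are not part of the CRM.

```python
# Illustrative sketch only: names and sample values are hypothetical.

def same_shape(a, b):
    """True if two nested lists of numbers have identical structure."""
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(same_shape(x, y) for x, y in zip(a, b))
    return not isinstance(a, list) and not isinstance(b, list)


def within_limits(value, lower, upper):
    """True if every position of `value` lies between `lower` and `upper`."""
    if not (same_shape(value, lower) and same_shape(value, upper)):
        raise ValueError("value and limits must share one structure")
    if isinstance(value, list):
        return all(within_limits(v, lo, hi) for v, lo, hi in zip(value, lower, upper))
    return lower <= value <= upper


# A 1x2 "image" whose pixels are (R, G, B) triples.
lower = [[[10, 20, 30], [0, 0, 0]]]
upper = [[[12, 25, 35], [5, 5, 5]]]
observed = [[[11, 22, 31], [3, 0, 5]]]

print(within_limits(observed, lower, upper))
```

The shape check matters here: a limit matrix with a different structure would not describe the same complex unit, so the sketch rejects it rather than comparing partial data.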

Consequently I propose to deprecate P83 and P84, because they compete with an interval interpretation of P90, and:

introduce instead Pxxx had duration, Domain: E52 Time-Span, Range: E54 Dimension,
and use P90, P90a, P90b as appropriate,

or introduce an Exxx Temporal Duration, subclass of E54 Dimension, and define subproperties in RDFS ending in xsd:duration.

See:

P83 had at least duration (was minimum duration of)

 

Domain:              E52 Time-Span

Range:                E54 Dimension

Quantification:    one to one (1,1:1,1)

 

Scope note:         This property describes the minimum length of time covered by an E52 Time-Span.

 

It allows an E52 Time-Span to be associated with an E54 Dimension representing its minimum duration (i.e. its inner boundary) independent from the actual beginning and end.

Examples:        

§  the time span of the Battle of Issos 333 B.C.E. (E52) had at least duration Battle of Issos minimum duration (E54) has unit (P91) day (E58) has value (P90) 1 (E60)

 

In First Order Logic:

                           P83(x,y) ⊃ E52(x)

                           P83(x,y) ⊃ E54(y)


P84 had at most duration (was maximum duration of)

Domain:              E52 Time-Span

Range:                E54 Dimension

Quantification:   one to one (1,1:1,1)

Scope note:         This property describes the maximum length of time covered by an E52 Time-Span.

It allows an E52 Time-Span to be associated with an E54 Dimension representing its maximum duration (i.e. its outer boundary) independent from the actual beginning and end.

Examples:        

§  the time span of the Battle of Issos 333 B.C.E. (E52) had at most duration Battle of Issos maximum duration (E54) has unit (P91) day (E58) has value (P90) 2 (E60)

In First Order Logic:

                           P84(x,y) ⊃ E52(x)

                           P84(x,y) ⊃ E54(y)
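
A minimal sketch of the interval reading proposed above, using the Battle of Issos figures from the P83 and P84 examples (at least 1 day, at most 2 days). The class and field names are assumptions made for illustration; only the property labels in the comments come from the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """Hypothetical stand-in for an E54 Dimension read as an interval."""
    unit: str     # P91 has unit
    lower: float  # P90a has lower value limit
    upper: float  # P90b has upper value limit

    def __post_init__(self):
        if self.lower > self.upper:
            raise ValueError("lower limit must not exceed upper limit")

    def contains(self, value: float) -> bool:
        """True if `value` lies within the closed interval."""
        return self.lower <= value <= self.upper


# One duration dimension replaces the separate P83 and P84 dimensions.
issos_duration = Dimension(unit="day", lower=1, upper=2)
print(issos_duration.contains(1.5))
```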

Current Proposal: 

Posted by Richard Light on 8/11/2018

While we're looking at this area, I would be grateful if we could also look at Value and Unit.

I have never understood how P90 and P91 are actually meant to be used together. I can see how a single E54 can be represented by a single P90 and a single P91, but how do we represent anything more complex?  An example would be "3 ft 6 inches".  Can that be an E54 Dimension, and if so how do you know which unit applies to which value?

Posted by Robert Sanderson on 8/11/2018

+1 to this issue.

This also happens a lot with MonetaryAmounts.  4 shillings and 6 pence is not 4.6 of any one currency.

Posted by Martin on 8/11/2018

Dear Richard,

It requires a sort of datatype or encoding.

Assume unit = "ft&inches"
               value = <3,6>

would that make sense?

In the xsd datatypes everything is in the value already.
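
To make the suggestion concrete, here is a minimal sketch in which the unit names an ordered list of subunits and the value is a tuple with one number per subunit. The conversion factors (12 inches to the foot, 12 pence to the shilling) and all names are assumptions made for this example, not anything defined by the CRM.

```python
# Hypothetical compound units: each subunit is paired with its size
# expressed in the smallest subunit of that compound unit.
FT_AND_INCHES = (("ft", 12), ("in", 1))
SHILLINGS_AND_PENCE = (("s", 12), ("d", 1))


def to_base(value, unit):
    """Collapse a tuple value such as (3, 6) into the smallest subunit."""
    if len(value) != len(unit):
        raise ValueError("value needs one number per subunit")
    return sum(number * factor for number, (_, factor) in zip(value, unit))


print(to_base((3, 6), FT_AND_INCHES))        # 3 ft 6 in expressed in inches
print(to_base((4, 6), SHILLINGS_AND_PENCE))  # 4s 6d expressed in pence
```

The position in the tuple is what ties each number to its subunit, which is one possible answer to the question of which unit applies to which value.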

Posted by Franco on 9/11/2018

Martin,

I agree with you, E60 Number is a jack-of-all-trades and can be a couple, a triple, whatever numeric value or set of values as long as it is clear what is what.

So for ancient/nonstandard/local units such as ft & inches or Roman cubitus I would add:

E58 Measurement Unit “ft&inches” P70 is documented in E31 Document “F.W Clarke, Weights Measures and Money of all Nations. Appleton & C. New York 1888”.

Incidentally, Prof. Clarke (from the U. of Cincinnati) wrote in the introduction “Our three sets of weights, our three different gallons, and our two dissimilar bushels, all unrelated to each other, or to the units of length, must soon give way before the simplicity and elegance of the metric system. That this event may soon happen [...] is the sincere wish and hope of the writer.” 130 years have passed since then, to no avail.

Thus, I would at least regard any such unit (system) as local or historical, and therefore needing a reference description: otherwise for me - and for any scientist - that value of 3 ft 6 inches could equally well be the distance of Alpha Centauri from the Earth, or the size of a bacterium.

Posted by Richard Light on 9/11/2018

On 08/11/2018 20:00, Martin Doerr wrote:

> Dear Richard,
>
> It requires a sort of datatype or encoding.
>
> Assume unit = "ft&inches"
>                value = <3,6>
>
> would that make sense?
>
> In the xsd datatypes everything is in the value already.

The XSD datatypes all resolve to single values, so I don't get a clear steer from them as to how to deal with the 'multiple units' issue.

I can see what you're saying as regards a 'complex' datatype, but I can't find examples on the Web of how the value would actually be encoded as an RDF value which software agents could do anything useful with.

The best I have come up with is this document from 2002:

http://infolab.stanford.edu/~melnik/rdf/datatyping/

which has some heavy hitters associated with it.  Is this the sort of approach you are proposing?

A slightly more complex example would be a geographical coordinate expressed as latitude and longitude (both expressed as degrees, minutes and seconds).
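
For the latitude and longitude case, a small sketch of the arithmetic involved: each coordinate is a (degrees, minutes, seconds) triple plus a hemisphere, and can be reduced to a single signed decimal number. The function name and the sample coordinate are made up for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    magnitude = degrees + minutes / 60 + seconds / 3600
    # South and west are conventionally negative.
    return -magnitude if hemisphere in ("S", "W") else magnitude


# An arbitrary sample position: 35 deg 30' 0" N, 25 deg 15' 0" E.
latitude = dms_to_decimal(35, 30, 0, "N")
longitude = dms_to_decimal(25, 15, 0, "E")
print(latitude, longitude)
```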

Posted by Martin on 10/11/2018

Dear Richard, All,

I think we need an expert in the respective kinds of syntax. I hope there is someone on this list working more at the programming level. I am no longer working at the programming level.
I believe that, from a general point of view, it is a non-issue. I regard this not as a question of feasibility, but of getting an IT person trained in this. I hope someone on this list knows, or knows someone who knows.

The examples "feet/inches" and "degrees, minutes and seconds" are mathematically exactly the same as a date composed of "Year/month/day", and even simpler, because there are no leap years etc. So, since the one works, the others must work as well, by analogy.

The geometric primitives, for instance WKT strings, describe points and volumes in 3- or even 4-dimensional spaces. Since this works, any n-dimensional value can be represented in the same way.

A simple way is this:
In a literal, we can store any XML or JSON chunk, and represent the schema as "unit".

Of course, we need to spell this out.
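
One way this could be spelled out, as a sketch only: the structured value is serialised as a JSON string for the literal, and the unit carries an identifier for the schema that says what each field means. The IRI and field names below are invented placeholders, and nothing here is an agreed CRM encoding.

```python
import json

# Hypothetical identifier for a "feet and inches" schema, used as the unit.
UNIT_IRI = "http://example.org/unit/ft-and-inches"


def encode_value(fields):
    """Serialise a structured value to a JSON string suitable for a literal."""
    return json.dumps(fields, sort_keys=True)


def decode_value(literal):
    """Recover the structured value from the JSON literal."""
    return json.loads(literal)


literal = encode_value({"ft": 3, "in": 6})
dimension = {"has_unit": UNIT_IRI, "has_value": literal}

print(dimension["has_value"])
print(decode_value(dimension["has_value"])["in"])
```

Sorting the keys keeps the literal stable, so two records of the same measurement serialise to the same string.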

In the 42nd joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 35th FRBR - CIDOC CRM Harmonization meeting, after discussing MD's proposal, the crm-sig decided in favor of introducing a new property Pxx had duration linking an instance of E52 Time-Span to an instance of E54 Dimension. MD was assigned to write the scope note for it.

The sig also decided to accept the proposed subproperties of P90 has value in CRM-RDFS, namely P90a_has_lower_value_limit and P90b_has_upper_value_limit. Consequently, the sig decided to deprecate P83 had at least duration (was minimum duration of) and P84 had at most duration (was maximum duration of), because they compete with an interval interpretation of P90.

The domain of the new Pxx had duration should be E52 Time-Span and its range should be E54 Dimension. Migration paths from the deprecated properties are to be made explicit.
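
A possible migration path, sketched with triples as plain Python tuples: every P83 and P84 statement on a time-span is replaced by a single duration dimension whose lower and upper limits come from the old minimum and maximum values. The identifiers, the `Pxx_had_duration` spelling, the naming scheme for the new dimension, and the assumption that both old dimensions use the same unit are all choices made for this illustration, not decisions of the sig.

```python
def migrate(triples):
    """Build the replacement triples for all P83/P84 statements.

    Only the new triples are returned; the old dimensions are not copied.
    """
    value_of = {s: o for s, p, o in triples if p == "P90_has_value"}
    unit_of = {s: o for s, p, o in triples if p == "P91_has_unit"}
    old = {"P83_had_at_least_duration": "P90a_has_lower_value_limit",
           "P84_had_at_most_duration": "P90b_has_upper_value_limit"}
    result = []
    for span, prop, dim in triples:
        if prop not in old:
            continue
        new_dim = span + "/duration"  # hypothetical naming scheme
        result.append((span, "Pxx_had_duration", new_dim))
        result.append((new_dim, "P91_has_unit", unit_of[dim]))
        result.append((new_dim, old[prop], value_of[dim]))
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(result))


source = [
    ("issos_span", "P83_had_at_least_duration", "issos_min"),
    ("issos_min", "P91_has_unit", "day"),
    ("issos_min", "P90_has_value", 1),
    ("issos_span", "P84_had_at_most_duration", "issos_max"),
    ("issos_max", "P91_has_unit", "day"),
    ("issos_max", "P90_has_value", 2),
]

for triple in migrate(source):
    print(triple)
```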


The alternative proposal of introducing a class Exx Temporal Duration, such that it is a subclass of E54 Dimension, and define properties it will participate in, was rejected by the crm-sig.
The idea is that duration, defined by time intervals, can be treated as a kind of dimension, and thus be defined by its inner and outer limits, corresponding to the lower and upper value, respectively. Deprecating the specific properties for temporal duration (P83/P84) in favor of a more generically applied set of properties will help increase the consistency of the model.
The alternative proposal by RL, that the said property be called 'has dimension', was not discussed further.

Berlin, November 2018

Posted by Martin 15/2/2019

Dear All


As discussed in Berlin, I proposed to deprecate P83 and P84, because they compete with an interval interpretation of P90, and:

introduce instead Pxxx had duration, Domain: E52 Time-Span, Range: E54 Dimension,
and use P90, P90a, P90b as appropriate, or introduce an Exxx Temporal Duration, subclass of E54 Dimension, and define subproperties in RDFS ending in xsd:duration.


Here is my definition:


Pxxx had duration (was duration of)

Domain:              E52 Time-Span

Range:                E54 Dimension

Quantification:    one to one (1,1:1,1)

Scope note:         This property describes the length of time covered by an E52 Time-Span. It allows an E52 Time-Span to be associated with an E54 Dimension representing duration (i.e. its inner boundary) independent from the actual beginning and end. Indeterminacy of the duration value can be expressed by assigning a numerical interval to the property P90 has value of E54 Dimension.

Examples:       

§  the time span of the Battle of Issos 333 B.C.E. (E52) had duration Battle of Issos minimum duration (E54) has unit (P91) day (E58) has value (P90) (E60)

In First Order Logic:

                           Pxxx(x,y) ⊃ E52(x)

                           Pxxx(x,y) ⊃ E54(y)

Comments?

------------------------------------------------------------------------------------------------------

See:

P83 had at least duration (was minimum duration of)

Domain:              E52 Time-Span

Range:                E54 Dimension

Quantification:    one to one (1,1:1,1)

Scope note:         This property describes the minimum length of time covered by an E52 Time-Span.

It allows an E52 Time-Span to be associated with an E54 Dimension representing its minimum duration (i.e. its inner boundary) independent from the actual beginning and end.

Examples:       

§  the time span of the Battle of Issos 333 B.C.E. (E52) had at least duration Battle of Issos minimum duration (E54) has unit (P91) day (E58) has value (P90) 1 (E60)

In First Order Logic:

                           P83(x,y) ⊃ E52(x)

                           P83(x,y) ⊃ E54(y)


P84 had at most duration (was maximum duration of)

Domain:              E52 Time-Span

Range:                E54 Dimension

Quantification:   one to one (1,1:1,1)

Scope note:         This property describes the maximum length of time covered by an E52 Time-Span.

It allows an E52 Time-Span to be associated with an E54 Dimension representing its maximum duration (i.e. its outer boundary) independent from the actual beginning and end.

Examples:       

§  the time span of the Battle of Issos 333 B.C.E. (E52) had at most duration Battle of Issos maximum duration (E54) has unit (P91) day (E58) has value (P90) 2 (E60)

In First Order Logic:

                           P84(x,y) ⊃ E52(x)

                           P84(x,y) ⊃ E54(y)

 

Posted by Robert Sanderson on 23/2/2019

This becomes problematic, unfortunately, in RDF which does not have a way to natively express a Number that is actually an interval.  The resolution would be to do the same as P81a/b … which would have the same effect as maintaining P83 and P84, just not in the model directly.

While I appreciate the theoretical consistency that this change would add, from an implementation perspective, this would bring more complexity than value.

Overall, I’m not in favor of the deprecation, but am not averse to adding had_duration separately, with the potential to deprecate P83 and P84 if a holistic approach to date and number intervals can be devised.

 

Thanks!

 

Posted by Martin on 23/2/2019

Dear Robert,

On 2/23/2019 1:09 AM, Robert Sanderson wrote:
> This becomes problematic, unfortunately, in RDF which does not have a way to natively express a Number that is actually an interval.  The resolution would be to do the same as P81a/b … which would have the same effect as maintaining P83 and P84, just not in the model directly.
>
> While I appreciate the theoretical consistency that this change would add, from an implementation perspective, this would bring more complexity than value.

I do not understand what increases the complexity. Either I have in RDFS two paths, P83-E54-P90 AND P84-E54-P90, plus the ambiguity of how to use P90a and P90b together with these paths, OR I have a single path Pxxx-E54 that splits into P90a and P90b. In the latter case I end up with two paths again, Pxxx-E54-P90a AND Pxxx-E54-P90b, and no ambiguity about whether to use P83 or P90a.

So where is the added complexity? I see it only reduced, but I may be wrong!

My second question was whether, since we have already bound the Dimension to temporal durations in the definition of Pxxx, we should express that by a subclass of E54.

Best,

Reference to Issues:

Meetings discussed: