Issue 295: Digital Libraries as physical objects

Starting Date: 
2016-02-07
Working Group: 
3
Status: 
Done
Closing Date: 
2018-01-18
Background: 

During the vote on changing the label of E78 from Collection to Curated Holding, the following emails were exchanged:

posted by Franco  on 6/2/2016

Not a vote, but an amateur comment.

-> Collection vs Curated Holdings
Of course. There are “collections" of things which are not curated, just an assemblage of stuff. Like some old objects I have which I call my collection of old computational instruments, but according to the qualified opinion of my wife is just a dust attractor.

But, what about a collection (oops, curated holdings) of videogames like the one of the National Videogame Museum of Frisco, TX, USA (https://en.wikipedia.org/wiki/Videogame_History_Museum)? It seems that what they collect (oops, “curate and hold”) forms a collection of E28 Conceptual Object rather than E18 Physical Thing.

Possibly defining E78 as a curated assemblage of instances of E72 Legal Object or, maybe even better, of E70 Thing (in some cases rights may be difficult to ascertain as in the examples below), would reconcile the CRM with the curators of the Frisco museum; and also simplify life to the curators of the Museo Officina Profumo from the Old Pharmacy of Santa Maria Novella in Florence, which I hope you will be able to visit when you come here for the next CRM meeting, as well as to their colleagues of the Musèe International du Perfum of Grasse, France. Is fragrance a Physical Thing? What about the exhibitions for visually impaired people consisting in a garden where they smell the scent of different flowers along the visit?

-> Holding vs holdings
My knowledge of English is even more superficial than my knowledge of the CRM, so my opinion is not so qualified. It seems to me that “holdings” is used here in the plural as a collective noun, and not as the result of putting together one individual “holding" with another one and another one and so on, as a genuine plural; it is unfortunately written in the plural, which makes pluralizing it a bit awkward if one wants to keep the “holdings" distinct from each other, as Werner has pointed out. Not only that: it makes it difficult to express indefiniteness, as when using the indefinite article “a” to indicate “one of a series", like in the second sentence of this email, where I was in trouble, being unable to call it “a curated holdings” as “a” cannot go with “holdings”. So what would be the correct way of expressing the equivalent of “a collection”, i.e. any one instance of the class “curated holdings”? Wouldn't the sentence "Is_A curated_holdings” describing a future subclass of E78 sound strange?

Finally, holdings is a synonym of property, according to my Oxford dictionary, which is not the case if the objects forming the collection (oops, holdings) are just deposited, on loan, or illegally/controversially detained. There are famous examples of the latter.

So I would prefer “curated set” or better “curated assemblage”, “set" being a very generic term that does not incorporate the concept of intentionality.

Anyway, this concern about a name is not so important: that which we call a rose, by any other name would smell as sweet. More important is, in my opinion, the issue concerning E18 vs E28 => E72/E70 to characterize the components of E78, as noted above.


Posted by Martin on 6/2/2016

As an explanation of what was discussed in the meeting:
Librarians talk about "holdings" with respect to library contents.
It means the physical copies. Therefore they are physical. So, there is a good practice of the term there,
which actually motivated the proposal.

The argument why we have modelled collections as physical things, regardless of the intentional content:
The video games, in order to be in a collection, must be represented by physical copies.
If we disregarded the physical nature of the copy, we could not talk about location and destruction.
This is also the sense in which libraries distinguish holdings from content.

Would that make sense?


posted by Franco on 7/2/2016

very clear, thanks. Just a bit nineteenth-century-ish: what about digital libraries and digital curation, which does not concern curating the servers (or the Cloud) on which their instances of E28 reside?
It seems that they are beyond the (current) scope of the CRM, which sounds a bit paradoxical, but nevertheless perfectly logical.

 


Posted by Martin on 7/2/2016

Maybe I confused you: by physical copies in the CRM we do not mean on paper. We mean any material
carrier, which implies computer discs, albeit somewhere in the Cloud only the provider knows (but someone
must know it).

Badly enough, this is not in the collection scope note. It is only implicit in the information carrier.

I propose to adapt the scope note of E78. The issue was modelled in more detail in FRBRoo, but not transferred
back into the CRM.

Would we agree now ?

Martin

E84 Information Carrier

Subclass of:         E22 Man-Made Object

Scope note:         This class comprises all instances of E22 Man-Made Object that are explicitly designed to act as persistent physical carriers for instances of E73 Information Object.

An E84 Information Carrier may or may not contain information, e.g., a diskette. Note that any E18 Physical Thing may carry information, such as an E34 Inscription. However, unless it was specifically designed for this purpose, it is not an Information Carrier. Therefore the property P128 carries (is carried by) applies to E18 Physical Thing in general.

Examples:          

§  the Rosetta Stone

§  my paperback copy of Crime & Punishment

§  the computer disk at ICS-FORTH that stores the canonical Definition of the CIDOC CRM

Current Proposal: 

In the 39th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 32nd FRBR - CIDOC CRM Harmonization meeting, the SIG, following Martin’s proposal to remove class E84 since it does not satisfy the requirements expressed in issue 340, proposed that the examples of a material carrier of a digital object be moved to E24 or, as a digital feature, to E25, and possibly to E78; alternatively, an example of a server holding a Digital Asset Management system could be added to E78.

Finally, the SIG asked Martin to draft an example. The issue will be completed with examples. It was decided to create a new issue to cover the discussion about whether E84 stays or goes.

Heraklion, October 2017

 

Posted by Martin on 4/1/2018

<HW>

Delete:
E84 Information Carrier

Subclass of:        E22 Man-Made Object

Scope note:       This class comprises all instances of E22 Man-Made Object that are explicitly designed to act as persistent physical carriers for instances of E73 Information Object.

An E84 Information Carrier may or may not contain information, e.g., a diskette. Note that any E18 Physical Thing may carry information, such as an E34 Inscription. However, unless it was specifically designed for this purpose, it is not an Information Carrier. Therefore the property P128 carries (is carried by) applies to E18 Physical Thing in general.

Examples:        

§  the Rosetta Stone

§  my paperback copy of Crime & Punishment

§  the computer disk at ICS-FORTH that stores the canonical Definition of the CIDOC CRM

 

In First Order Logic:

                           E84(x) ⊃ E22(x)

New examples in:

E78 Curated Holding

Subclass of:         E24 Physical Man-Made Thing

 

Scope note:          This class comprises aggregations of instances of E18 Physical Thing that are assembled and maintained (“curated” and “preserved,” in museological terminology) by one or more instances of E39 Actor over time for a specific purpose and audience, and according to a particular collection development plan.  Typical instances of curated holdings are museum collections, archives, library holdings and digital libraries. A digital library is regarded as an instance of E18 Physical Thing because it requires keeping physical carriers of the electronic content.

 

Items may be added or removed from an E78 Curated Holding in pursuit of this plan. This class should not be confused with the E39 Actor maintaining the E78 Curated Holding often referred to with the name of the E78 Curated Holding (e.g. “The Wallace Collection decided…”).

 

Collective objects in the general sense, like a tomb full of gifts, a folder with stamps or a set of chessmen, should be documented as instances of E19 Physical Object, and not as instances of E78 Curated Holding. This is because they form wholes either because they are physically bound together or because they are kept together for their functionality.

 

Examples:         

§  the John Clayton Herbarium

§  the Wallace Collection

§  Mikael Heggelund Foslie’s coralline red algae Herbarium at Museum of Natural History and Archaeology, Trondheim, Norway

§  The Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018.

In First Order Logic:

                           E78(x) ⊃ E24(x)

E24 Physical Man-Made Thing

Subclass of:         E18 Physical Thing

                           E71 Man-Made Thing

Superclass of:      E22 Man-Made Object

E25 Man-Made Feature

E78 Collection

 

Scope Note:        This class comprises all persistent physical items that are purposely created by human activity.

 

This class comprises man-made objects, such as swords, and man-made features, such as rock art. No assumptions are made as to the extent of modification required to justify regarding an object as man-made. For example, a “cup and ring” carving on bedrock is regarded as an instance of E24 Physical Man-Made Thing.

Examples:         

§  the Forth Railway Bridge (E22)

§  the Channel Tunnel (E25)

§  the Historical Collection of the Museum Benaki in Athens (E78)

§  the Rosetta Stone (E22)

§  my paperback copy of Crime & Punishment (E22)

§  the computer disk at ICS-FORTH that stores the canonical Definition of the CIDOC CRM (E22)

§  my empty DVD disk (E22)

 

In First Order Logic:

                           E24(x) ⊃ E18(x)

                           E24(x) ⊃ E71(x)

Properties:

P62 depicts (is depicted by): E1 CRM Entity

(P62.1 mode of depiction: E55 Type)

P65 shows visual item (is shown by): E36 Visual Item

Scope Note extension:

E25 Man-Made Feature

Subclass of:         E24 Physical Man-Made Thing

E26 Physical Feature

 

Scope Note:        This class comprises physical features that are purposely created by human activity, such as scratches, artificial caves, artificial water channels, etc. In particular it includes the information encoding features on mechanical or digital carriers.

 

No assumptions are made as to the extent of modification required to justify regarding a feature as man-made. For example, rock art or even “cup and ring” carvings on bedrock are regarded as types of E25 Man-Made Feature.

Examples:         

§  the Manchester Ship Canal

§  Michael Jackson’s nose following plastic surgery

§  The laser-readable “pits” engraved June 2014 in my CD-R, copying songs of Edith Piaf’s.

§  The carved letters on the Rosetta Stone

In First Order Logic:

                           E25(x) ⊃ E26(x)

                            E25(x) ⊃ E24(x)
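
The subclass axioms quoted in this homework can be read together as a small hierarchy. The following is a minimal sketch in Python, not part of the CRM or of any library: the dictionary `SUPERCLASSES` and the helper `is_a` are hypothetical names introduced only for illustration, and the table contains just the axioms listed above (with E84 already deleted). It checks that an instance of E78 is necessarily an instance of E18 Physical Thing, while nothing makes it an instance of E73 Information Object.

```python
# Direct superclasses, taken only from the axioms quoted in this issue.
# The names SUPERCLASSES and is_a are illustrative assumptions.
SUPERCLASSES = {
    "E78": {"E24"},          # E78(x) ⊃ E24(x)
    "E24": {"E18", "E71"},   # E24(x) ⊃ E18(x), E24(x) ⊃ E71(x)
    "E25": {"E26", "E24"},   # E25(x) ⊃ E26(x), E25(x) ⊃ E24(x)
    "E22": {"E24"},          # from "Superclass of: E22 Man-Made Object"
}


def is_a(cls: str, ancestor: str) -> bool:
    """Return True if every instance of `cls` is also an instance of `ancestor`."""
    if cls == ancestor:
        return True
    # Follow the subclass links transitively.
    return any(is_a(parent, ancestor) for parent in SUPERCLASSES.get(cls, ()))


print(is_a("E78", "E18"))  # True: E78 -> E24 -> E18
print(is_a("E25", "E18"))  # True: E25 -> E24 -> E18
print(is_a("E78", "E73"))  # False: no chain leads to E73
```

Read this way, any example given for E78, including a digital library, has to be something that can be located and destroyed like any other E18 Physical Thing.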

posted by Øyvind on 8/1/2018

<COMMENT> <PROPOSAL>

it looks good to me. Given the deprecation of E84 it could be nice to add a sentence to the scope note of E24, something like:
 
”Instances of this class can act as carriers of instances of E73 Information Object.”

Posted by Steve on 8/1/2018

<COMMENT>

I agree that would be helpful.

 

Posted by Thanasis on 10/1/2018

<PROPOSAL>

Shouldn't this:

§  The Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018.

be instead:

§  The group of servers (hardware) holding the Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018.

The term "Digital Collections" will not necessarily mean a physical thing for many readers. 

posted by Martin on 10/1/2018

<COMMENT>

Dear Thanasi,

On 1/10/2018 1:30 PM, Athanasios Velios wrote:
> Shouldn't this:
>
> §  The Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018.
>
> be instead:
>
> §  The group of servers (hardware) holding the Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018.
>
> The term "Digital Collections" will not necessarily mean a physical thing for many readers.
Actually we do not mean the servers as a whole, but only the material signal encoding on the media. This interpretation gives the correct answers: that the collection can be destroyed, that it is a "holding" in the hands of the maintainers, i.e., physically kept, and that it can change like a physical thing losing its previous form.
The immaterial item would not change and could reside on multiple carriers. An update would create a new derivative, i.e., another thing, not affecting other copies around.
The material interpretation is problematic if the content is moved around servers.

Another interpretation is that of a "volatile dataset", which we at FORTH used in the PARTHENOS project and which uses the logical condition that there is only one representative version of the data object at any point in time, regardless of carrier. It updates like a material object. This may in general create a problem if the authority identifying the correct representative version is not clear. I tried to be neutral to this dilemma by using the URL, which points to the physical "location" under which the representative version will appear, and makes the storage system an internal issue of the maintainer.

Consider a "move" of the database to another storage system and a simultaneous update. Then, formally, neither the carrier nor the content is the same, but it is still the same "digital library".

Note, that if I make a copy of a digital library, I get an immaterial object, which will not be representative after the first change to the original, without me doing anything. Hence, the digital library does not behave like an Information Object in the sense of the CRM. 
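
The difference between the two identity conditions discussed in this message can be illustrated with a short sketch. This is only one possible reading under stated assumptions, not a CRM formalization: the class names `InformationObject` and `VolatileDataset`, their methods, and the URL `https://example.org/library` are hypothetical and introduced here purely for illustration. The immaterial object is modelled as immutable (an update yields a new derivative), while the volatile dataset keeps its identity through updates and has exactly one representative version at any time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationObject:
    """Immaterial content: identified by the content itself, so it never changes."""
    content: str

    def updated(self, new_content: str) -> "InformationObject":
        # An update yields a new derivative; existing copies are unaffected.
        return InformationObject(new_content)


class VolatileDataset:
    """Identified by its access point (URL); one representative version at a time."""

    def __init__(self, url: str, content: str) -> None:
        self.url = url
        self.current = InformationObject(content)

    def update(self, new_content: str) -> None:
        # Same dataset, new representative version.
        self.current = self.current.updated(new_content)

    def is_representative(self, copy: InformationObject) -> bool:
        return copy == self.current


# Hypothetical URL and content, for illustration only.
library = VolatileDataset("https://example.org/library", "catalogue v1")
my_copy = library.current          # a copy taken at a point in time
library.update("catalogue v2")     # the maintainer changes the original

print(library.is_representative(my_copy))  # False
print(my_copy.content)                     # catalogue v1
```

The copy itself is unchanged, yet it stops being representative as soon as the original is updated, which is the behaviour the message describes as unlike that of an Information Object in the sense of the CRM.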

posted by Franco  on 10/1/2018

<COMMENT>

Quoting Martin below

[By Digital Collections] ... we do not mean the servers as a whole, but only the material signal encoding on the media.

This statement is an oxymoron. Whatever material thing cannot be digital, not even “signals”: according to my Oxford Dictionary, digital means "expressed as a series of the digits 0 and 1". In a collection, whatever it is, you just get more 0’s and 1’s but no material thing.

Thanasis is right as regards deprecating the use of the expression “Digital Collections”. This term does not mean a material thing also for the authors of the Oxford Dictionary, besides the many readers he mentions that include myself.

I may agree that the “encoding on the media” consists in (perhaps temporary and reversible) alterations of the media itself, possibly with only two different states eg black/white, positive/negative, etc, to encode the content according to a predefined code; and recorded there magnetically, optically or carved (the Code of Hammurabi kept at the Louvre, unfortunately not with a binary code); in any case altering (some property of) the media itself. It could also be Martin Doerr’s voice, analogically recorded on vinyl  on 10/01/2018 from 21:48 to 22:30 while reading the Code of Hammurabi in Akkadian (with a nice voice but with a terrible German accent, unfortunately) .

So, thumbs down for "digital collections”.
 

posted by Daria Hookk on 10/1/2018

<COMMENT>

A QR code is very physical (on a surface) and absolutely digital, because it presents 0s and 1s.


 

posted by Franco on 11/1/2018

<COMMENT>

Not really, Daria. It is not digital, it possibly represents/is the support of a digital encoding, same as a selfie of me on my iphone is not my face.

posted by Franco on 11/1/2018

All these examples show that the issue exists!

My opinion in short: there is of course a distinction between “hard” and “soft” copies. Hard (i.e. material) copies involve modifying matter; soft (i.e. immaterial) ones don’t. Hard copies are affected by degradation; soft ones aren’t. Soft copies may be digital (e.g. music on a CD or on a hard disk) or analog (e.g. same music on vinyl) or ... (same music transcribed on music paper); hard copies are what they are.
Association of soft stuff with the hard copy is rather subjective: BA BA BA BAAAAA may correspond to the beginning of Beethoven’s 5th symphony as well as Herbert von Karajan’s & Berliner Philharmoniker Orchestra digital version now playing on my Mac. There may be a “canonical” association between the soft stuff I receive and perceive, and a (master) hard version, e.g. between the 5th symphony and Beethoven’s original manuscript kept at the Staatsbibliothek Berlin. I think most of the above is addressed in FRBR and CRM uses a simplification to deal with immaterial content, as it is considered to be borderline within its scope. But sometimes (this may not be the case) oversimplification turns into confusion.

posted by Achille on 11/1/2018

Dear Franco,

> On 10 Jan 2018, at 21:52, Franco Niccolucci <franco.niccolucci@gmail.com> wrote:
>
> Quoting Martin below
>
> [By Digital Collections] ... we do not mean the servers as a whole, but only the material signal encoding on the media.
>
> This statement is an oxymoron. Whatever material thing cannot be digital, not even “signals”: according to my Oxford Dictionary, digital means "expressed as a series of the digits 0 and 1". In a collection, whatever it is, you just get more 0’s and 1’s but no material thing.

For completeness it should also be noted that the Oxford Dictionary goes on to explain that the 0 and 1 digits are: “typically represented by values of a physical quantity such as voltage or magnetic polarization”, which seems, in some way, to refer to some kind of “physicality” still present “in the background” 

posted by Franco on 11/1/2018

Thanks Achille.

That sentence about 0s and 1s is there probably because people, and especially humanists like dictionary editors, don’t understand the nature of numbers.

The number “two” is the number two, not two cows, two oranges or two humans. Its definition does not need any physical representation and even abstracts from any conceptual way of representing it, i.e. with a binary system (0s and 1s) or sexagesimal one. Actually in most cases, and in most people’s minds, two is 1 and 2, not 0 and 1, which comes in because the computer representation uses a flip-flop circuit.

This is very clear from Martin’s distinction quoted in some previous email between Maxwell’s equations and the way they are formally represented, and then printed in a book. So there are three levels: the concept, the conceptual representation, and its physical footprint. Of these, two are described by the CRM, the intermediate one being probably considered as irrelevant.
 

posted by Achille on 11/1/2018

I fully support your threefold view of the digital world, but I also have the impression, in the case of a digital collection, that the particular situation under which such an entity exists needs some specific conceptualisation. In particular, my question is: in which of the three levels you mentioned are we acting while talking about a “digital collection”?

If we consider the conceptual level, we have to observe that, according to your view, it should be constituted by the sequence of 0-1 digits necessary to express it; but this sequence does not exist “a priori” at an archetypical level (like the conceptual number “two” in your example) in our mind; it only starts its existence once the circuits have done their job and the electrical sequence is created. Only at this point can I describe it and express it with all the 0 & 1 digits resulting from this (physical) process. So it is an “a posteriori” knowledge inferred after the physical process.

Another question concerns the stability of the digits and the order they are arranged in: does the modification of this arrangement affect the nature of my digital object?

Again, I have some difficulties in ascribing a digital object to the pure level of the abstracts or representative things …

A.

posted by Franco on 11/1/2018

Dear Achille

my position is to deprecate the use of the term “digital collection” to precisely designate anything, as it is ambiguous (I agree with Thanasis).

If used colloquially, it is an acceptable term to designate the assembly of various digital objects.

In no case can it refer to or exemplify an (instance of) E28 Conceptual Object, which is not digital; nor to an E84 Information Carrier, now subsumed by the mother class E24 Physical Man-Made Thing, because they are (were) "persistent physical carriers for instances of E73 Information Object”, thus they can’t be digital.

The proposed example for E78, subclass of E24

"The Digital Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018."

is not wrong per se, as here the term may designate the hard disk (portion) where such “collections” are stored, but might be misleading and confusing, as it commonly and colloquially refers to the content (E78) of the said hard disk, and not to the physical storage. And if it is a hard disk, it is not digital, it is hard and possibly stores digital information; but may also be a pencil sharpener, as you well know, if you attach abrasive paper on top of it; or an improvised weapon, if you get angry and want to kill your office mates. 

posted by Christian-Emil on 11/1/2018

The number two is what all sets with two elements have in common or according to Gottlob Frege the number two is to count to two etc etc.

Most dictionaries I have checked focus on the difference between digital as discrete signals and analog as continuous signals. I think this will change since digital already has a tendency to denote something connected to computer/non-analog electronic gadgets/devices.

After creating and developing what has been called digital collections for the last 25 years and working together with the scholars curating such beasts, my observation is that a digital collection is very similar to the traditional "physical" collections. There are of course some differences. You cannot really store a stuffed mammoth in a computer system without destroying the computer system.  You may store a digital image depicting it. A digital collection is a collection of data (maybe information but let us drop that debate here) which mostly could have been stored on paper, magnetic tape etc, but in the case of a digital collection it will be stored in a computer system. One usually doesn’t care about the representation level (bit direction, sound waves in mercury (Turing) etc.).

A (digital) collection may be copied and published as a finished unit. My copy will not be the collection. It will be a copy (of the content) of the collection at a given point in time.  It is definitely a physical thing.

To make the discussion more complex: The curatorial aspect is also important when using the word collection. A collection can be an actor + a physical (data) set + the activity of curating the (data) set.

In the fourth example one could put the word Digital in parentheses or delete it:
The (Digital) Collections of the Munich DigitiZation Center (MDZ) accessible via https://www.digitale-sammlungen.de/ at least in January 2018

posted by Martin on 11/1/2018

Dear Franco,

I'd like to clarify: The CRM does not intend to constitute a dictionary of what terms mean in an absolute way. Of course the content of the Digital Library at any point in time is immaterial. No objections to your comments. But it is equally obvious, that any accessible form of it must be a physical feature.

The question we are discussing is rather which of the involved things is the one that provides an identity condition that corresponds to the
notion of "holding" and "my copy". If I am not mistaken, copyright enforcement may require someone to "delete" a digital file from his system.
If this is true, then, clearly, there exists a sense which is socially and legally functional, of a "digital file" as a physical feature.

The next question is whether the concept of "holding" a digital library pertains to the physical or the immaterial nature. If a library has a book,
we always mean a physical copy, even though the content of the book is not theirs. In that sense, the digital and the traditional information form
makes no difference.

Classes in the CRM are defined exclusively to represent the phenomena of reality in such a way that the relations between them are well defined, or better "confined" to the domain and range.
The relationship of "holding" is one of these relations. Hence the question is which of the involved phenomena is the relevant one for "holding": the material, the immaterial, or another one; and not whether "digital" means physical or immaterial out of context.

It appears to me that the important new quality of "digital" is NOT at all that it is a new information form or that it is "virtual". At first glance, it is just another materialization of information. What makes the difference is the ease and speed with which we can move it from one carrier to another. This changes practices of holding. Whereas a library would not produce books, it may quite well now "reproduce" (photocopy etc.) books from its holdings. It may not be allowed to reproduce from others' holdings. Digital libraries will send copies over electronic communication.  The Munich Digital Library explicitly refers to analog holdings they have digitized.

Now, for the digital library as a whole, undergoing intended changes, what is the relevant identity condition, and how to model that in an easy way? I tried to point out that both a concept of a "volatile information object" and describing the materialization have their pros and cons.

Would that make sense?:-)
 

posted by Martin on 11/1/2018

Dear All,

I am fascinated by your enthusiasm and good comments, BUT if any one of you had opened the link,

https://www.digitale-sammlungen.de/index.html?&l=en

you would have seen, that they use the term "Digital Collections", and not me!

So, I improve:

* The "Digital Collections" of the "Munich DigitiZation Center (MDZ)" accessible via https://www.digitale-sammlungen.de/ at least in January 2018.

and I insist NOT to use any other term in that example than the real life one...

posted by Franco on 11/1/2018

Dear Martin

I do not want to bore the SIG members any more, but your argument serves my cause, and not yours.

Let me remind you that the discussion is about an example of (E78 Curated Holding, which is a subclass of) E24 Physical Man-Made Thing, i.e. the hard disks or whatever is carrying at MDZ the E73 Information Object. Undoubtedly E78 inherits the physicality of the superclass E24.
The new E78 example says: 'The "Digital Collections" of the "Munich DigitiZation Center (MDZ)" accessible via https://www.digitale-sammlungen.de/ at least in January 2018.'

On the same MDZ web page, under the heading “Digital Collections”, MDZ offers services to Search Collections, Explore the Collections or thematic groupings. Do they mean they have developed a way to enable me to walk the hard disks bit by bit (i.e. explore the E24/E78) as the example would imply, or rather the information of which they are carriers (i.e. E73)?

It is probably the second. MDZ and its users are very little interested in hard disks (E78) and more in the content they carry (E73). Then, their use of “Digital Collection” refers to a grouping of E73 instances rather than to hardware. Apparently, Martin does the opposite.

But, I surrender, I do not want to become a hater in the SIG community nor see my messages automatically redirected to the SPAM folder.

Unfortunately I can’t attend the Cologne meeting, where we could have continued this passionate discussion, possibly adding other very interesting themes like "How many angels can dance on the head of a pin”. By the way, what is an angel? An E39 Actor, an E90 Symbolic Object, an AAAAARRGH

posted by Christian Emil on 12/1/2018

Dear all,
I opened the link and, as I mentioned yesterday, the term "digital collection" is frequently used to denote computerized collections like MDZ's. So even if the OED and other dictionaries are somewhat dated, we can safely interpret the term the way it is used by MDZ.  I completely agree with Martin.

Our discussion touches on the topic of what is a model and what is modelled, cf. the useless term 'digital surrogate' used in the EDM context.
 

Outcome: 

In the 40th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 33rd FRBR - CIDOC CRM Harmonization meeting, the SIG reviewed MD’s HW and decided the following:

  • E84 Information Carrier: deleted
  • E78 Curated Holding: new examples have been added
  • E24 Physical Man-Made Thing:
      • Changes in the scope note
      • Examples moved from E84 to E24
      • Also, we should look for an example of some well-known sort of information-bearing object that does not have information on it, e.g. an empty blackboard. This is HW for MD
  • E25 Man-Made Feature: scope note extension and two examples have been added

The text of the discussion appears here

The issue is closed

Cologne, January 2018

Reference to Issues: