rebipp / ppi

REBIPP: Plant-Pollinator Interactions Data Vocabulary

Home Page:https://ppi.rebipp.org.br


Summarizing discussions about "single visit" and "multiple visits"

zedomel opened this issue · comments

There are some problems with the terms that define the single-visit and multiple-visit conditions. We need to specify the scopes in which these terms can be applied.

  • What is the relationship of single visit terms with the interaction being documented?
  • What is the relationship of multiple visits terms with the interaction being documented?

Definitions of single visit and multiple visits can be taken from @pjbergamo's comment on #64:

Single visit terms
Single visit measurements could be assigned to the interaction between one animal species and one plant species. Usually, such measurements are conducted when one wants to estimate the "effectiveness" of an animal species as a pollinator of a plant species. In other words, how many pollen grains an animal species deposits in a single visit (and consequently, how many pollen tubes, number of fruits, seeds and so on). Such measurements fit our database well. The problem is that they are quite rare, because it is very time consuming to conduct effectiveness approaches. I would say we can keep those terms, as they are very adequate for our database, and we can create all corresponding "Single visit" terms.

According to this definition, measurements of:

  • the number of conspecific pollen grains deposited on the stigma(s) of the flower
  • the number of heterospecific pollen grains deposited on the stigma(s) of the flower
  • pollen tubes quantity
  • number of fertilized ovules
  • number of pollen grains removed from the flower
  • fruit set
  • fruit mass
  • seed set
  • seed mass

all apply to the single flower on which the Interaction was recorded, so we know exactly which flower the definitions are talking about (on the stigma(s) of THE FLOWER, THE FLOWER = the flower which the animal visited). So in that case (single visit) the terms can be added as properties of the Interaction (we are documenting a specific interaction).

Multiple visit terms
Multiple visit terms would make more sense if we treat them as plant traits (as we are treating flower color, nectar, etc). This is because they cannot be assigned to the interaction between one animal species and one plant species. They are likely the product of multiple interactions (multiple animals and one plant species). I am comparing with the other plant traits because we also do not measure all specific flowers in a plant that interacted with the animal. We measure some flowers in that population to gain insight about the average (and variability) of plant traits that we suspect are important in defining plant-pollinator interactions. Similarly, we measure reproductive success in some flowers of that population to gain insight about the average (and variability) of the consequences of plant-pollinator interactions. Depending on the study, interactions, traits and reproductive success may be measured in the same plant individuals (but never in the same flowers); however, this is not always the case.

In this case (multiple visits), none of the terms above applies to the Interaction being documented. Semantically, if we add these terms to the Interaction, we will be saying that that specific Interaction between an animal occurrence and a plant occurrence yields these measurements (e.g. fruit set, pollen grains deposited), but as @pjbergamo said, that is not what we mean here. So the definitions of these terms in the multiple-visitor case do not refer to a specific flower, and consequently we cannot know which flowers the definitions are talking about (does THE FLOWER(s) mean the flower(s) of a specific plant individual, or flowers sampled from a population, without individual linkage?).
Treating these terms like traits will require that we add them as properties of the plant occurrence (plant traits). But those terms are related to the number of flowers visited, which is a property of the Interaction. So let's see an example:

interactionID (eventID) | animal | plant | numberOfFlowersVisited | conspecificPollenGrainsQuantity | fruitSet
evt_1 | Ceratina asunciana | Pfaffia tuberosa | 12 | 200 | 8

What can we know from the interaction in the example above and the definition of the terms?

  • It is an interaction (with a unique identifier evt_1) between Ceratina asunciana and Pfaffia tuberosa, and the animal(s) visited 12 (distinct?) flowers of the plant. However, we have no way of knowing whether these 12 flowers were exposed to multiple visitors or to a single visitor. Then, from the current definitions of:

  • conspecificPollenGrainsQuantity: The number of conspecific pollen grains deposited on the flower's stigma(s) exposed to multiple visitors at the end of flower anthesis

  • fruitSet: Proportion of the flowers exposed to floral visitors that yielded fruits

Both terms apply to the multiple-visit case, so we should interpret the example as:

  • The individual(s) represented in the plant occurrence (Pfaffia tuberosa) received a total of 200 pollen grains over an undefined/unknown number of flowers and visits. Although we know that this pollen load is due to multiple visits, numberOfFlowersVisited is a property of the Interaction, so we cannot assert that conspecificPollenGrainsQuantity applies to those 12 flowers. It may apply to a different number of flowers that received multiple visits (including visits by other animal species not documented in this particular interaction). The same applies to fruitSet, as we don't know whether the number of flowers considered when counting mature fruits is the same as the number of flowers visited by the animal documented in this particular interaction.
    So, putting all this together, this record should be interpreted as:
    An interaction (with a unique identifier evt_1) between Ceratina asunciana individual(s) and Pfaffia tuberosa individual(s), where the individual(s) represented in the animal dwc:Occurrence visited 12 flowers of the individual(s) represented in the plant dwc:Occurrence. Also, the plant individual(s) received a total of 200 conspecific pollen grains on an unknown number of flowers, and an unknown number of flowers set 8 mature fruits.
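The interpretation above can be sketched as a small data model. This is only an illustration, assuming the multiple-visit terms are stored as traits of the plant occurrence; the class names, field names and the identifier occ_plant_1 are hypothetical and not part of the vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlantOccurrence:
    # Hypothetical stand-in for the plant dwc:Occurrence
    occurrence_id: str
    scientific_name: str
    # Multiple-visit measurements stored as plant traits (assumption)
    conspecific_pollen_grains_quantity: Optional[int] = None
    fruit_set: Optional[float] = None


@dataclass
class Interaction:
    # Hypothetical stand-in for the Interaction (event) record
    interaction_id: str
    animal: str
    plant: PlantOccurrence
    number_of_flowers_visited: int


# The example record, split between the two entities
plant = PlantOccurrence("occ_plant_1", "Pfaffia tuberosa", 200, 8)
evt_1 = Interaction("evt_1", "Ceratina asunciana", plant, 12)

# The flower count belongs to the interaction; the pollen load and fruit set
# belong to the plant occurrence and say nothing about those 12 flowers.
print(evt_1.number_of_flowers_visited)
print(evt_1.plant.conspecific_pollen_grains_quantity, evt_1.plant.fruit_set)
```

In this layout nothing links the 200 pollen grains or the 8 fruits to the 12 visited flowers, which is exactly the ambiguity described above.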

If this is all the information that we are trying to encode here, then I would say that it is fine to treat those terms as traits, despite some confusion that may occur when similar terms (e.g. conspecificPollenGrainsQuantitySingleVisit vs. conspecificPollenGrainsQuantity) are used as properties of the plant occurrence (traits) or of the interaction. Otherwise, if the encoded information is wrong, we should review the usage and the definitions of these terms.

Some things that we are (perhaps) missing:

  • A term for the number of flowers on which the measurements of pollen grains deposited, pollen tubes, removed pollen grains, etc. are taken
  • A term for the total number of visits that a plant received, considering all animal visitors (of the same and of different animal species; all interactions). This number can be calculated by summing numberOfVisits over all Interactions in which a particular individual participates (all the interactions of the same plant dwc:Occurrence). But what about the case where these two numbers differ, i.e. SUM(numberOfVisits) != total number of visits that the plant received? Are all visits always recorded/counted, so that we can always use the SUM instead of having a new term to specify the total number of visits?
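To make the SUM question concrete, here is a minimal sketch that sums numberOfVisits per plant occurrence and compares it with an independently recorded total. The key plantOccurrenceID, the separate totals mapping and all the numbers are assumptions made up for the illustration.

```python
from collections import defaultdict


def summed_visits(interactions):
    """Sum numberOfVisits over all interactions of each plant occurrence."""
    totals = defaultdict(int)
    for rec in interactions:
        totals[rec["plantOccurrenceID"]] += rec["numberOfVisits"]
    return dict(totals)


def unrecorded_visits(interactions, total_visits_by_plant):
    """Return, per plant occurrence, how many visits the SUM does not account for."""
    sums = summed_visits(interactions)
    return {
        plant_id: total - sums.get(plant_id, 0)
        for plant_id, total in total_visits_by_plant.items()
        if total != sums.get(plant_id, 0)
    }


# Hypothetical records: three interactions on two plant occurrences
interactions = [
    {"interactionID": "evt_1", "plantOccurrenceID": "occ_1", "numberOfVisits": 5},
    {"interactionID": "evt_2", "plantOccurrenceID": "occ_1", "numberOfVisits": 3},
    {"interactionID": "evt_3", "plantOccurrenceID": "occ_2", "numberOfVisits": 4},
]
# Hypothetical independently counted totals (the candidate new term)
totals = {"occ_1": 10, "occ_2": 4}

print(summed_visits(interactions))
print(unrecorded_visits(interactions, totals))
```

If the second result is always empty in real datasets, the SUM is enough; if not, a separate term for the total would carry information the interactions alone cannot.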

I just created a diagram so we can better see what is going on and discuss it further:

[Diagram: interactions_TermsReview]

conspecificPollenGrainsQuantity

@zedomel wrote on #41 :

Multiple visitors implies multiple Interactions, so it is a 1-to-many relationship with Interaction. The term is linked to one or more Interactions, or to none at all (if the Interaction is not being recorded, just the conspecific pollen grains in exposed flowers).

If we treat this term like a trait, then this comment no longer applies to multiple interactions (1-to-many): it will be just a 1-to-1 relationship with the plant dwc:Occurrence, without being linked to any interaction. We will not be able to know which particular interactions are responsible for the pollen load, but we know that all the recorded interactions for the same plant occurrence are partially responsible for it, if all those interactions happened on exposed flowers (flowers that received multiple visits).

@pjbergamo wrote on #41:

I would say that for most cases, none at all.
Single visit and multiple visit are distinct processes. Single visit measurements are concerned with measuring the effectiveness of one pollinator species on a plant species (in other words, how many pollen grains each pollinator that interacts with a flower species deposits onto stigmas). This would be a one-to-one relationship. We currently do not have any term for this situation, and unfortunately this data is extremely rare as it is very time consuming to obtain it.
This term was proposed only for the multiple visits situation, as an overall measurement of pollination success of a flower.

From the sentence "single visit measurements are concerned with measuring the effectiveness of one pollinator species on a plant species" we have that, for single visit, the animal species responsible for the pollen deposition is important and must be recorded. But "[...] multiple visits situation, as an overall measurement of pollination success of a flower" is just about pollination and not the pollinator, so I presume that the animal species is not relevant here. Additionally, this particular sentence says "pollination success of A FLOWER", so should I really consider that as the success of A FLOWER or of A PLANT? Since the individual is the plant, and the amount of pollen deposited on the stigma(s) has an impact on the individual's fitness, why record it at a finer granularity (A FLOWER)?
Maybe we already have a solution for that by defining the term to be the total number of conspecific pollen grains, where total applies to all flowers (at least those flowers considered by the measurement) of an individual in a particular interaction.

@cepnunes wrote on #41:

To keep this term comparable among different studies and somehow keep it of general use, I suggest changing the definition to: "The quantity of conspecific pollen grains deposited on the flower's stigma(s) exposed to multiple visitors in a given period of flower anthesis." That would require the addition of another field with the time during which the flower was exposed to visitors or pollinators. This time could or could not be the same as the entire flower anthesis. But it would allow the inclusion of more studies, as it would not be restricted only to studies that collected after the end of anthesis.

It eliminates the problem of setting a specific time when the measurement has to be taken. But it does not solve the problem of recording single visit vs. multiple visits as described above.

seedSet

Definition: The total number of seeds of a mature fruit exposed to a single flower visitor

The definition makes clear that it is about JUST ONE FRUIT AND JUST ONE FLOWER! From the explanation above, this term should be used as a property of the Interaction (single visit). But since it is about a single fruit (A MATURE FRUIT), it has a 1-to-many relationship with the Interaction. So one interaction may have one or more seedSets (one for each mature fruit of a flower exposed to a single visitor); otherwise we would limit interactions to always being recorded at the flower level (one interaction = one animal visiting one flower).

So we have (at least) two options:

  • Make clear that it has a 1-to-many relationship with the Interaction or
  • Change definition so it has a 1-to-1 relationship with the Interaction (but how?)

@cepnunes wrote on #37:

I don't know if it would help or just make the data insertion form much more complicated, but I believe seedSet as well as other terms (listed by @pjbergamo) could be split into at least three boxes: one with the absolute number of the trait measured, one with the number of flowers observed for this entry, and another with a time measure (which could vary from minutes to a whole flowering season). As an example for seedSet, we could have something like [243] seed(s) per [1] fruit(s) per [1] flowering season. In this case, the number of seeds would be a number, as would the other two boxes. The box with the time measure would then have to be followed by another box with a controlled vocabulary if we want to include time measurements other than a single flowering season.
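The three-box idea could look roughly like the record below. This is a sketch under assumptions: the class name, field names and the contents of the controlled vocabulary for the time unit are all hypothetical, chosen only to reproduce the seedSet example from the comment.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary for the time box
ALLOWED_PERIOD_UNITS = {"minute", "hour", "day", "flowering season"}


@dataclass(frozen=True)
class BoxedMeasurement:
    value: int            # box 1: absolute number measured (e.g. seeds)
    value_label: str
    sample_count: int     # box 2: number of flowers/fruits observed
    sample_label: str
    period_count: int     # box 3: length of the observation period
    period_unit: str      # must come from the controlled vocabulary

    def __post_init__(self):
        if self.period_unit not in ALLOWED_PERIOD_UNITS:
            raise ValueError(f"unknown period unit: {self.period_unit!r}")

    def describe(self) -> str:
        return (
            f"{self.value} {self.value_label} per "
            f"{self.sample_count} {self.sample_label} per "
            f"{self.period_count} {self.period_unit}"
        )


seed_set = BoxedMeasurement(243, "seed(s)", 1, "fruit(s)", 1, "flowering season")
print(seed_set.describe())  # 243 seed(s) per 1 fruit(s) per 1 flowering season
```

Each box stays a plain number, and only the time unit needs a vocabulary, which keeps a data entry form close to what the comment describes.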

This could be a way to change the term from 1-to-many to 1-to-1. But will it cover all use cases?

@pjbergamo wrote on #37:

For synthesis purposes, the important data are: average, variation (best always standard deviation) and number of samples.
If it is clear that one should complete the database with raw data (i.e. each seed set data measured for each individual instead of an average seed set for the population), all these data can then be retrieved afterwards.
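As a rough illustration of that point, the three synthesis values can be derived from raw per-fruit records with the standard library alone. The function name and the seed counts are made up for the example.

```python
import statistics


def summarize(raw_values):
    """Return sample size, mean and sample standard deviation of raw measurements."""
    n = len(raw_values)
    return {
        "n": n,
        "mean": statistics.mean(raw_values) if n else None,
        # The sample standard deviation needs at least two data points
        "sd": statistics.stdev(raw_values) if n >= 2 else None,
    }


# Hypothetical raw seed counts, one value per fruit
seed_counts = [10, 12, 14]
print(summarize(seed_counts))
```

The reverse is not possible: once only the average is stored, the individual values cannot be recovered, which is the argument for collecting raw data.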

Yes. We should look for raw data (not aggregated/summarized data), and in that case 1-to-many can give us more detailed information. However, we can define it to be the total number of seeds in the fruit set. What do you think? Do we have this strict relation between fruitSet and seedSet (seeds are always counted for the fruits in the fruit set)?

removedPollenGrainQuantity

Definition: The total number of pollen grains removed from the anther(s) of a flower

@cepnunes wrote on #36 :

Like in some other terms (like #37 ), I believe that this term only makes sense in relation to another measurement, such as the total amount of pollen per flower. Moreover, I believe very few studies present this data, but with the methods for counting pollen grains becoming easier and more accessible, there will be more studies with this information in the future.

@anselmoeco wrote on #36:

Another point is that the studies that measure this descriptor make the data available by flower, or by groups of anthers of the same flower (e.g. larger or smaller anthers), or even by anther. So it would be good to avoid this misunderstanding in the definition of that term.

@pjbergamo:

The issues here can be solved like the issues raised for conspecific pollen grains on stigmas, pollen tubes, etc. Here, the relevant data would be the percentage left or removed (just choose one and go with it) - e.g. 40% pollen grains removed (or 60% remained) in a flower.
My suggestion is to keep it at flower level (in other words, pollen removal estimate coming from all anthers, which would cover the majority of the cases - in some flowers it is just impossible to estimate pollen removal anther per anther).

Does "at the flower level" mean 1-to-many with the plant occurrence?

fruitSet

Definition: Proportion of the flowers exposed to floral visitors that yielded fruits

Can we use the number of flowers that set fruit instead of the proportion? A proportion is not the data actually measured; it is calculated from two other measures: the number of flowers marked and the number of flowers that set fruit.
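A short sketch of why storing the two counts loses nothing: the proportion can always be recomputed from them, while the counts cannot be recovered from the proportion. The function name, parameter names and the numbers are illustrative assumptions.

```python
def fruit_set_proportion(flowers_marked: int, flowers_with_fruit: int) -> float:
    """Proportion of marked flowers that set fruit."""
    if flowers_marked <= 0:
        raise ValueError("flowers_marked must be positive")
    if not 0 <= flowers_with_fruit <= flowers_marked:
        raise ValueError("flowers_with_fruit must be between 0 and flowers_marked")
    return flowers_with_fruit / flowers_marked


# Hypothetical counts: 20 flowers marked, 8 of them set fruit
print(fruit_set_proportion(20, 8))  # 0.4
```

Note that 8 of 20 and 2 of 5 give the same proportion, so the sample size is lost if only the derived value is stored.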

@pjbergamo wrote on #34:

2.3) Fruit set (multiple visits)
Fruit set is not related to the number of flowers exposed to animal
visitors. Usually, fruit set is measured by marking a subset of the flowers
available (it is very hard to mark all available flowers, especially in
trees or shrubs with multiple branches) and, from this subset of flowers,
one will then count how many set fruit.

So, do we need a term for documenting the number of marked flowers (the subset of the flowers available)?
If the fruit set is not related to the number of flowers exposed to animal visitors, that must be spelled out in the definition. The way it is now, it makes a clear statement that fruit set is about the flowers exposed to floral visitors, so it is related to the flowers exposed to animal visitors by definition!

In @pjbergamo's explanation we have "Usually, fruit set is measured by marking a subset of the flowers". Is it a subset of the flowers of a given individual, or flowers from a population? It is one thing to say that a fruitSet is specific to particular individual(s) (dwc:Occurrence with dwc:organismQuantity >= 1), and another to say that the fruitSet is specific to flowers sampled from a plant population without the related dwc:Occurrence(s). In the first case the fruitSet is linked to the Occurrences; in the second it is not linked to anything (it is just a decontextualized measurement).

@pjbergamo also wrote on #34:

2.4) fruitMass, seedSet, seedMass
These measurements are not proportions and should be treated as
cospecificPollenGrains (and the other pollen-related reproductive success
variables: pollen tubes, fertilized ovules and so on).

I'm assuming that he is talking about multiple visits terms, and so fruitMass, seedSet and seedMass should be treated as traits. I have described some problems with that above: 1-to-many or 1-to-1.

@cepnunes thank you for contributing.

Honestly, I'm not a fan of aggregated measurements: since they are not raw data, we will miss a lot of information. But if this proves to be the best solution that we can figure out at this moment, then I will not be opposed to it.

However, even using aggregated measurements does not solve the problem of which of the entities (Interaction or Occurrence) these terms (conspecificPollenGrains, fruitSet) should be linked with:

  • Linking conspecificPollenGrains, fruitSet or any other reproductive success term to a particular Interaction means that these values (average or raw values) are specific to that particular Interaction. One Interaction is not limited to an interaction between just two individuals. It can be an interaction between two groups of taxonomically homogeneous organisms, as shown in the Figure in my last comment.
  • Linking reproductive success terms to the plant Occurrence means that these values (average or raw) are specific to a particular Occurrence. Since one Occurrence can represent multiple taxonomically homogeneous organisms (dwc:organismQuantity), these values will represent measurements over the individuals in the Occurrence. Although these measurements are strictly related to interactions (some interaction has to happen for us to have fruitSet, conspecificPollenGrains, etc.), by doing this we will lose any information about which interactions (including the pollinator's information) are responsible for the reproductive success of the individuals in the Occurrence.

I don't know which solution best fits the most common use cases. This is something that we have to figure out by investigating different use cases and examples of datasets.

And as you have already mentioned, we should also look for a solution that makes data input/digitization simple from the user's perspective, even if some abstraction has to be developed on top of that by the Information Systems.

thanks.