Sample Labelling
Following the protocol hierarchy we can define what labels on vials, items and files should look like:
ID of the potential file
Experiment ID actual ERI run extension (for files)
▲ ▲ ▲
│ │ │
┌───────┐ ┌────┴─────┐ ┌───────┐ ┌────┴────┐ ┌────┐ ┌───┴──┐
│PROJECT├──┤EXPERIMENT├──┤ E.R.I ├──┤REPLICATE├──┤leaf├──┤[.ext]│
└───┬───┘ └──────────┘ └───┬───┘ └─────────┘ └─┬──┘ └──────┘
│ │ │
▼ ▼ ▼
Project ID Experiment Unique sample-specific
Realization string of text
ID
For example, assume we are studying how in ancient Asian tea was brewed.
We have found several tea leaves in a tomb and we are exploring the taste of
different tea-brewing techniques.
The overarching project is called Ancient Tea, and we decide that its ID will
be ANTEA
.
It's possible to choose any string of text for a project ID, but it must be
universally unique. If we will do another tea-related project in the future,
we must call it something different than ANTEA
(and possibly something that is
not easily confused with the original, like ANTEA2
).
We think that we need an experiment for taste-testing brewed tea, so we define
the experiment with ID tea_taste
for it.
We then define the protocols that we need:
- Wash liquid containers;
- Heat up liquid: heat up to some degrees some liquid using the electic kettle;
- Brew tea: place tea leaves in hot water and brew them for some time;
- Taste substance: use the standard questionnaire to profile the taste of some substance.
We can now realize the experiment for this specific tea-related case:
- Wash cups: to take dirty cups and wash them;
- Heat up water: to take water and cups, and make cups with hot water inside and heating it up to 85 degrees Celsius, to not destroy the fragile leaves;
- Brew tea: brew for 5 minutes while stirring every minute.
- Taste hot tea: to take brewed tea and taste-test it, and record the taste. We need an additional section of the questionnaire for the specific bitterness of the tea.
This is the realization of the experiment. In the realized experiment we
have defined all the variables that are kept purpusefully vague in the generic
experiment.
We assign an ID to the realization too, making sure that the combo EXPERIMENT
plus E.R.I.
is unique. For example, we can use tea_taste
plus ancient
,
creating the string tea_taste-ancient
for this realization of the protocol.
When we actually do the run, we give it an ID. Since this is the first time
that the tea_taste-ancient
experiment is run in the context of the ANTEA
project, we give it the ID of 1
.
At this point, the full ID for this run is ANTEA-tea_taste-ancient-1
.
Labelling sigle data points
Each experimental run may have one ore more associated things, such as samples, vials, measurements, etc... It's important to label each item in the experimental run with a specific leaf, a string with information about it.
How to structure leaves is less strictly defined as we need to allow for
as much flexibility as possible.
However, it's a good idea to continue with the principle of hierarchy.
To continue with the above example, say that we have brewed five different
cups (cup_1
through cup_5
) by taking different tea leaves aliquotes (black,
green). Each cup is then idependently tested by five different people
(taster_a
, taster-b
, taster_c
, etc...).
We might have leaf-ids that look like this:
black-cup_1-taster_a
black-cup_1-taster_b
black-cup_1-taster_c
...
black-cup_2-taster_a
black-cup_2-taster_b
...
green-cup_1-taster_a
...
green-cup_5-taster_e
Going from highest ranking ("type of tea") to intermediate ("cup number") to the most fine ("taster ID").
Each measurement (in this case "how good the tea was") is associated with the
full ID. For example, taster_e
has rated 5/5 the tea from cup 2 made from
black tea leaves.
Thus we record the score like this:
ID, score
black-cup_2-taster_e 5
What to include in the leaf
The above "leaf" is made up of metadata. However, how much metadata should be included in the leaf is up to the experimenter. A leaf identifier should be primarily useful for the human experimenter that is performing the run to know at a glance what is in this vial, what this measurement is about, and/or things useful for housekeeping, such as when the vial was collected or stored.
In any case, all metadata about the experiment should be saved in separate metadata files, as detailed in what is metadata?. This means that the leaf might be as short and as obscure as a single number, and as long as containing all metadata variables: as long as it is unique for the specific thing it is describing, it is not really important what form it takes.