Add cloud laboratory protocol value sets#63

Open
cmungall wants to merge 3 commits into main from claude/kind-fermat-jbwgzh

@cmungall

Member

Summary

Adds comprehensive controlled vocabularies for cloud-laboratory research protocols, modeled on the operational taxonomy of remote/cloud labs such as Emerald Cloud Lab (ECL) and its Symbolic Lab Language (SLL).

Key Changes

  • New schema module: src/valuesets/schema/lab_automation/cloud_lab.yaml

    • Defines LabUnitOperationEnum: 40+ composable sample-manipulation primitives (Transfer, Aliquot, Mix, Incubate, Filter, Centrifuge, etc.)
    • Defines CloudLabExperimentEnum: 100+ higher-level experiment/assay functions organized by category (synthesis, separations/chromatography, spectroscopy, mass spectrometry, bioassays, crystallography, property measurement, cellular)
    • Maps to OBI, CHMO, and MMO ontology terms via meaning: fields where equivalent classes exist
    • Includes comprehensive descriptions and aliases for each operation and experiment type
  • Updated main schema: src/valuesets/schema/valuesets.yaml

    • Added import for the new lab_automation/cloud_lab module
  • Updated ontology term caches:

    • Added 24 new CHMO terms to cache/chmo/terms.csv
    • Added 3 new OBI terms to cache/obi/terms.csv

Implementation Details

  • Two complementary layers capture both low-level unit operations and high-level experiment functions
  • Cross-references to existing assay value sets under bio/assays/ (OBIAssayEnum, BAOBioassayEnum)
  • Uses neutral ontology mappings rather than vendor-specific identifiers
  • Status marked as DRAFT with contributor attribution
  • Follows project conventions for enum naming (CamelCase) and permissible values (UPPER_CASE)
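The naming conventions in the last bullet can be checked mechanically. The following is a minimal sketch and not part of this PR: it assumes the schema has already been parsed into a plain dict (for example with a YAML loader), and the helper name `check_naming`, the regular expressions, and the sample data are all hypothetical.

```python
import re

# Loose CamelCase check: starts with an uppercase letter, letters and digits only.
ENUM_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*$")
# UPPER_CASE: uppercase letters/digits in groups separated by single underscores.
VALUE_NAME = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def check_naming(schema: dict) -> list[str]:
    """Return a list of human-readable naming violations."""
    problems = []
    for enum_name, enum_def in schema.get("enums", {}).items():
        if not ENUM_NAME.match(enum_name):
            problems.append(f"enum name not CamelCase: {enum_name}")
        for value in (enum_def or {}).get("permissible_values", {}):
            if not VALUE_NAME.match(value):
                problems.append(f"{enum_name}: value not UPPER_CASE: {value}")
    return problems


# Hypothetical excerpt shaped like a parsed LinkML schema; "mix" is
# deliberately lower-case to trigger a violation.
sample = {
    "enums": {
        "LabUnitOperationEnum": {
            "permissible_values": {"TRANSFER": {}, "ALIQUOT": {}, "mix": {}}
        }
    }
}

print(check_naming(sample))
```

A check like this would only cover naming; it says nothing about whether descriptions or mappings are present.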

https://claude.ai/code/session_01G74MMk5js9HDsGL3VbfAxQ

claude added 3 commits June 11, 2026 15:50
Add lab_automation/cloud_lab.yaml with two controlled vocabularies modeled
on the Emerald Cloud Lab / Symbolic Lab Language operational taxonomy:

- LabUnitOperationEnum: composable sample-preparation unit operations
  (transfer, aliquot, mix, incubate, filter, centrifuge, pellet, resuspend,
  lyophilize, etc.), complementing lab_automation/operations.
- CloudLabExperimentEnum: higher-level experiment/assay functions grouped by
  category (synthesis, separations, spectroscopy, mass spectrometry,
  bioassays, crystallography, property measurement, cellular), cross-referenced
  to bio/assays.

Wire the module into the top-level valuesets.yaml import list. Ontology
meaning: mappings to OBI/CHMO/MMO are being added in a follow-up.

https://claude.ai/code/session_01G74MMk5js9HDsGL3VbfAxQ
Map permissible values to ontology terms, with every CURIE verified via
OAK/OLS against CHMO and OBI:
- LabUnitOperationEnum: mix, centrifuge, filter, degas, evaporate,
  lyophilize, grind.
- CloudLabExperimentEnum: chromatography/separations, electrophoresis,
  spectroscopy, mass spectrometry, and core bioassays (ELISA, qPCR,
  Western blot, DNA sequencing, PCR, BLI).

Add aliases: entries equal to the exact ontology label where it differs in
form from the title (verb vs noun, acronyms, en-dash variants) to keep
mappings label-consistent. Workflow/labeling primitives and values without a
precise, label-consistent ontology class are intentionally left unmapped.

https://claude.ai/code/session_01G74MMk5js9HDsGL3VbfAxQ
@claude

claude Bot commented Jun 11, 2026

Contributor

Claude finished @cmungall's task


Code Review Complete ✅

Reviewing this PR that adds cloud laboratory protocol value sets. Here's my review checklist:

  • Read project guidelines from CLAUDE.md
  • Review new schema file structure and content
  • Check main schema import changes
  • Analyze ontology cache updates
  • Validate schema syntax and conventions
  • Check for potential issues and provide feedback

Review Summary

This is a high-quality contribution that adds comprehensive controlled vocabularies for cloud laboratory automation. The implementation follows project conventions and best practices.

Strengths

Schema Design & Structure:

  • ✅ Follows LinkML conventions with proper imports, prefixes, and enum structures
  • ✅ Well-organized into two complementary layers: unit operations (40+ items) and experiments (100+ items)
  • ✅ Adheres to project naming conventions (CamelCase enums, UPPER_CASE permissible values)
  • ✅ Comprehensive descriptions and aliases for each operation/experiment

Ontology Integration:

  • ✅ Proper use of meaning: fields with 42 ontology mappings to CHMO, OBI, and MMO terms
  • ✅ Updated cache files correctly add 24 new CHMO terms and 3 new OBI terms
  • ✅ All ontology terms appear legitimate and properly formatted as CURIEs
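The "properly formatted as CURIEs" check can be approximated in a few lines. This is a sketch under stated assumptions, not the review's actual procedure: the function name `bad_meanings`, the regular expression, and the sample identifiers are hypothetical placeholders, and the check covers format and prefix only. Confirming that a term exists and carries the expected label still requires an ontology lookup such as the OAK/OLS verification described in the commit message.

```python
import re

# Prefixes named in this PR; extend if other ontologies are mapped.
ALLOWED_PREFIXES = {"CHMO", "OBI", "MMO"}
# OBO-style CURIE: PREFIX:digits (the length of the numeric part is not enforced).
CURIE = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def bad_meanings(permissible_values: dict) -> dict:
    """Map value name -> offending `meaning` string; unmapped values are skipped."""
    bad = {}
    for name, body in permissible_values.items():
        meaning = (body or {}).get("meaning")
        if meaning is None:
            continue  # intentionally unmapped values are allowed
        match = CURIE.match(meaning)
        if match is None or match.group(1) not in ALLOWED_PREFIXES:
            bad[name] = meaning
    return bad


# Placeholder identifiers for illustration only; not real term IDs from the PR.
sample = {
    "MIX": {"meaning": "CHMO:0000000"},
    "TRANSFER": {},
    "FILTER": {"meaning": "chmo_0000000"},
    "GRIND": {"meaning": "XYZ:0000000"},
}

print(bad_meanings(sample))
```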

Documentation & Metadata:

  • ✅ Thorough descriptions explaining the purpose and relationship to existing value sets
  • ✅ Appropriate contributor attribution and DRAFT status marking
  • ✅ Clear cross-references to existing bio/assays value sets

🔍 Minor Observations

Consistency:

  • ⚠️ Some permissible values carry noun-form aliases (e.g., mixing, centrifugation) while comparable ones do not (e.g., FILTER has no filtration alias). This is acceptable but could be standardized in a future iteration.
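The alias inconsistency noted above could be surfaced with a small helper. This is a hypothetical sketch: `values_without_aliases` is not a project function, and the sample dict is an illustrative excerpt built from the examples in this comment rather than the real contents of cloud_lab.yaml.

```python
def values_without_aliases(permissible_values: dict) -> list[str]:
    """Return the names of permissible values with no `aliases` entry, sorted."""
    return sorted(
        name
        for name, body in permissible_values.items()
        if not (body or {}).get("aliases")
    )


# Illustrative excerpt only, assuming the dict shape of a parsed LinkML enum.
sample = {
    "MIX": {"aliases": ["mixing"]},
    "CENTRIFUGE": {"aliases": ["centrifugation"]},
    "FILTER": {},
}

print(values_without_aliases(sample))
```

Note that a value missing an alias is not necessarily an error: per the commit message, aliases are added only where the ontology label differs in form from the title.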

Coverage:

  • ✅ Good balance between comprehensive coverage and practical utility
  • ✅ Neutral ontology mappings rather than vendor-specific terms (excellent design choice)

🚀 No Issues Found

  • Code Quality: Schema syntax appears correct, follows LinkML patterns
  • Security: No security concerns - pure data definitions
  • Performance: Minimal performance impact - static enum definitions
  • Test Coverage: Schema-only changes don't require additional test coverage

📋 Recommendations

  1. Future Enhancement: Consider adding cross-references between related unit operations and experiments where applicable
  2. Documentation: The existing documentation is excellent - no changes needed

This PR significantly enhances the value set collection with well-structured, properly annotated laboratory automation vocabularies. Recommend approval.
