#30 backfill: gut/rhizosphere cohort batch 1 (8 communities) #91

Open
realmarcin wants to merge 1 commit into main from backfill-gut-rhizosphere-batch1

Conversation

@realmarcin
Contributor

Starts the gut/rhizosphere arm of the #30 related_ingredients backfill to broaden ENVO/substrate coverage (continues #90 and the metals cohort #79/#80/#81/#83).

Every entry uses a CHEBI term verified live against the ChEBI sqlite db via OAK, with snippets copied verbatim from cached PMID/DOI abstracts.
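The verbatim-snippet requirement can be illustrated with a minimal sketch. This is not the check used in this PR; the function names and the sample abstract below are hypothetical, and the only assumption is that a snippet counts as verbatim if it appears in the cached abstract once whitespace is normalized.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not affect matching."""
    return re.sub(r"\s+", " ", text).strip()


def snippet_is_verbatim(snippet: str, abstract: str) -> bool:
    """Return True if a non-empty snippet occurs in the abstract, ignoring whitespace differences."""
    needle = normalize(snippet)
    return bool(needle) and needle in normalize(abstract)


# Hypothetical abstract text, invented for illustration only.
abstract = """The community converts fructan to acetate and formate,
which the methanogen consumes to produce methane."""

print(snippet_is_verbatim("fructan to acetate and formate, which", abstract))
print(snippet_is_verbatim("fructans to acetate", abstract))
```

The first call matches across the line break; the second fails because "fructans" does not appear in the abstract, which is the kind of paraphrase the curation rule is meant to reject.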

related_ingredients adoption: 33/265 → 41/265 (about 12.5% → 15.5%).

Communities (8)

| Community | Ingredients (CHEBI-verified) |
|---|---|
| Bacteroides_Methanobrevibacter_Gnotobiotic_Mouse_Mutualism | formate, methane, acetate, fructan |
| SIHUMIx_Human_Intestinal_Model_Community | acetate, butyrate, carbohydrate, choline |
| Arabidopsis_Coumarin_Root_SynCom | coumarin, iron(3+) |
| Maize_Benzoxazinoid_Metabolizing_SynComs | 6-methoxy-2-benzoxazolinone (MBOA) |
| Infant_Gut_HMO_SynCom | oligosaccharide (HMO), organic acid, carbohydrate |
| Engineered_Gut_Amino_Acid_CrossFeeding_Consortium | amino acid |
| Avena_Rhizosphere_CrossKingdom_SIP_Community | carbon dioxide (¹³C SIP substrate) |
| GLBRC_Exometabolite_Transwell_SynCom | antibacterial agent (exometabolite class) |

Curation note

Several gut/rhizosphere abstracts name few specific compounds, so some entries fall back to broad CHEBI classes (carbohydrate, amino acid, organic acid, antibacterial agent) rather than inventing specific-compound snippets. These entries are backed by the cited abstracts and can be refined once richer references are cached. The agents verified every ID via OAK and did not fabricate specifics: DIMBOA, scopoletin, and specific HMOs/SCFAs were dropped whenever they were not named verbatim in the cited abstract.
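The fallback rule described in this note can be sketched roughly as follows. This is an illustration under stated assumptions, not the agents' actual logic: `pick_ingredient` is a hypothetical helper, the sample abstract is invented, and matching is a naive case-insensitive substring test.

```python
from typing import List, Optional


def pick_ingredient(abstract: str, specific_names: List[str], fallback_class: str) -> Optional[str]:
    """Prefer a specific compound named in the abstract; otherwise fall back to
    the broad class if the class itself is named; otherwise return None."""
    text = abstract.lower()
    for name in specific_names:
        if name.lower() in text:
            return name
    if fallback_class.lower() in text:
        return fallback_class
    # Nothing is named in the abstract, so nothing is recorded.
    return None


# Hypothetical abstract text, invented for illustration only.
abstract = "Engineered strains cross-feed amino acids that each auxotroph cannot synthesize."

print(pick_ingredient(abstract, ["lysine", "tryptophan"], "amino acid"))
print(pick_ingredient(abstract, ["scopoletin"], "coumarin"))
```

With this sample text, the first call falls back to the broad class because neither specific compound is named, and the second returns `None` because neither the compound nor its class appears. A real check would also need the matched text to serve as the verbatim evidence snippet.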

Test plan

  • just test → 136 passed, 9 skipped
  • All 8 files validate clean (linkml-validate)

🤖 Generated with Claude Code

Starts the gut/rhizosphere arm of the #30 related_ingredients backfill
to broaden ENVO/substrate coverage. Every entry uses a CHEBI term
verified live against the ChEBI sqlite db via OAK, with snippets copied
verbatim from cached PMID/DOI abstracts. No cross-repo IDs.

related_ingredients adoption: 33/265 -> 41/265.

| Community | Ingredients (CHEBI-verified) |
|---|---|
| Bacteroides_Methanobrevibacter_Gnotobiotic_Mouse_Mutualism | formate, methane, acetate, fructan |
| SIHUMIx_Human_Intestinal_Model_Community | acetate, butyrate, carbohydrate, choline |
| Arabidopsis_Coumarin_Root_SynCom | coumarin, iron(3+) |
| Maize_Benzoxazinoid_Metabolizing_SynComs | 6-methoxy-2-benzoxazolinone (MBOA) |
| BioModels_..._Infant_Gut_HMO_SynCom | oligosaccharide (HMO), organic acid, carbohydrate |
| Engineered_Gut_Amino_Acid_CrossFeeding_Consortium | amino acid |
| Avena_Rhizosphere_CrossKingdom_SIP_Community | carbon dioxide (13C SIP substrate) |
| GLBRC_Exometabolite_Transwell_SynCom | antibacterial agent (exometabolite class) |

Several gut/rhizosphere abstracts name few specific compounds, so some
entries use broad CHEBI classes (carbohydrate, amino acid, organic acid,
antibacterial agent) rather than invent specific-compound snippets.
These are honest, evidence-backed, and improvable when richer references
are cached.

Test plan: just test (136 passed, 9 skipped), all 8 files validate clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 04:29

Copilot AI left a comment

Pull request overview

This PR continues the #30 related_ingredients backfill by adding evidence-backed ChEBI ingredient links to 8 gut/rhizosphere community records, broadening environmental and substrate coverage without changing schema or code.

Changes:

  • Adds related_ingredients blocks to 8 community YAML files.
  • Uses ChEBI terms plus evidence snippets for gut, rhizosphere, SynCom, and exometabolite communities.
  • Omits mediaingredientmech_id, consistent with prior backfill batches where cross-repo IDs have not yet been minted.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| kb/communities/SIHUMIx_Human_Intestinal_Model_Community.yaml | Adds acetate, butyrate, carbohydrate, and choline ingredient links. |
| kb/communities/Maize_Benzoxazinoid_Metabolizing_SynComs.yaml | Adds MBOA as the central benzoxazinoid substrate. |
| kb/communities/GLBRC_Exometabolite_Transwell_SynCom.yaml | Adds antibacterial agent/antibiotics as an exometabolite class. |
| kb/communities/Engineered_Gut_Amino_Acid_CrossFeeding_Consortium.yaml | Adds amino acid as the engineered cross-feeding substrate class. |
| kb/communities/BioModels_MODEL2405300001_Infant_Gut_HMO_SynCom.yaml | Adds HMO-related oligosaccharide, organic acid, and carbohydrate links. |
| kb/communities/Bacteroides_Methanobrevibacter_Gnotobiotic_Mouse_Mutualism.yaml | Adds formate, methane, acetate, and fructan links for the gut mutualism. |
| kb/communities/Avena_Rhizosphere_CrossKingdom_SIP_Community.yaml | Adds carbon dioxide as the labeled SIP substrate. |
| kb/communities/Arabidopsis_Coumarin_Root_SynCom.yaml | Adds coumarin and ferric iron links for the root SynCom. |
