Fix missing prerequisite edges for units using enrolment_rules #6
Merged
…ent_rules

211 units (MTH2021, MTH2010, and others across Science, Pharmacy, and Education) store their PREREQUISITE and PROHIBITION relationships as HTML prose in enrolment_rules rather than the structured requisites field. This left requisite_refs empty for those units, so the tree graph showed no edges and "what does X unlock" views omitted them entirely.

Fix the ingest pipeline to also extract unit code refs from enrolment_rules descriptions that carry a <strong>PREREQUISITE</strong> or <strong>PROHIBITION</strong> label with handbook unit links.

Add a data migration (0007) to backfill all existing years, inserting 4,496 new prerequisite refs and 1,683 prohibition refs.

Also fix a bug in requisite-tree-view where the units map title lookup was computed but leaf.academic_item_name was still rendered instead of the resolved name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
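The title-lookup fix described above can be illustrated with a minimal sketch. Only leaf.academic_item_name comes from the commit message; the function name resolveLeafName, the academic_item_code and title fields, the fallback order, and the example unit are assumptions made for illustration, not the component's actual code.

```typescript
// Hypothetical shapes: only academic_item_name is named in the commit message.
interface Leaf {
  academic_item_code: string; // assumed field name
  academic_item_name: string | null;
}

interface UnitSummary {
  title: string; // assumed field name
}

// Prefer the title resolved from the units map; fall back to the name stored
// on the leaf, then to the bare code, so a label is always produced.
function resolveLeafName(leaf: Leaf, units: Map<string, UnitSummary>): string {
  const resolved = units.get(leaf.academic_item_code)?.title;
  return resolved ?? leaf.academic_item_name ?? leaf.academic_item_code;
}

// Invented example data: XYZ1001 is not a real unit code.
const units = new Map<string, UnitSummary>([
  ["XYZ1001", { title: "Example unit title" }],
]);
const staleLeaf: Leaf = { academic_item_code: "XYZ1001", academic_item_name: null };

console.log(resolveLeafName(staleLeaf, units)); // "Example unit title"
```

The point of the sketch is that the rendered label is the value returned by the lookup, not the raw field on the leaf.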
… hosts, ingest/SQL parity

Validated the first pass against the live corpus (31,098 enrolment_rules rows, 2020-2026) and found three correctness gaps:

- Whole-description classification mislabeled edges. 121 descriptions mix PREREQUISITE + PROHIBITION (81/32 also mix CO-REQUISITE), so labelling the whole blob by its first <strong> tag put ~126 links under the wrong type (e.g. a prohibited unit shown as a prerequisite). Now split each description at every <strong> label and classify per section.
- The migration and the ingest parser used different logic, so a re-ingest would rewrite the backfilled rows. They now share identical extraction, verified byte-identical over the whole corpus (6,624 refs, 0 drift).
- Only handbook.monash.edu unit URLs were matched. Recover legacy hosts (www[3].monash.edu/pubs/.../units/CODE.html); still ignore /courses/ and /aos/ links that appear in the same prose, and drop self-references (105 artifacts like CHM3990's own corequisite). Also extract CO-REQUISITE (591 edges).

Migration dry-run (rolled back) inserts 6,623 rows, near-disjoint from the structured requisite_refs (1 incidental overlap, no-op under ON CONFLICT).

Also:

- Extract the parser into a tested extractEnrolmentRuleRefs() with regression cases for each gotcha (CIV4283 mixed labels, MTH2010 course-link exclusion, CHM3990 self-coreq, plain-text codes left unparsed).
- Revert the unrelated pnpm-lock.yaml / pnpm-workspace.yaml churn (drizzle-orm pin relaxation + allowBuilds block) that rode along from the fork's main.
- Document the enrolment_rules quirk in docs/handbook-internals.md.
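The per-section classification described above could look roughly like the sketch below. It is not the repository's extractEnrolmentRuleRefs(); the names sketchExtractRefs, RefKind, RequisiteRef, LABELS and UNIT_LINK are hypothetical, and the URL shapes and the three-letters-plus-four-digits unit code pattern are assumptions. The example HTML and its XYZ codes are invented.

```typescript
// Hypothetical sketch of splitting at each <strong> label and classifying
// the links in each section by the label that precedes them.
type RefKind = "prerequisite" | "prohibition" | "corequisite";

interface RequisiteRef {
  kind: RefKind;
  code: string;
}

// Label text inside <strong> mapped to an edge type.
const LABELS: Record<string, RefKind> = {
  "PREREQUISITE": "prerequisite",
  "PROHIBITION": "prohibition",
  "CO-REQUISITE": "corequisite",
};

// Assumed link shapes: handbook.monash.edu/.../units/CODE and
// www[3].monash.edu/.../units/CODE.html. Links to /courses/ or /aos/
// have no units/ path segment and therefore never match.
const UNIT_LINK =
  /href="https?:\/\/(?:handbook|www3?)\.monash\.edu\/(?:[^"]*?\/)?units\/([A-Z]{3}\d{4})(?:\.html)?(?:[?#][^"]*)?"/;

function sketchExtractRefs(ownCode: string, html: string): RequisiteRef[] {
  // Record every <strong> label with the offsets where it starts and ends.
  const labelRe = /<strong>\s*([^<]*?)\s*:?\s*<\/strong>/gi;
  const labels: { kind: RefKind | undefined; start: number; end: number }[] = [];
  let m: RegExpExecArray | null;
  while ((m = labelRe.exec(html)) !== null) {
    labels.push({
      kind: LABELS[m[1].toUpperCase()],
      start: m.index,
      end: m.index + m[0].length,
    });
  }

  const refs: RequisiteRef[] = [];
  const seen: Record<string, true> = {};
  labels.forEach((label, i) => {
    const kind = label.kind;
    if (!kind) return; // unrecognised label: skip its section
    // A section runs from the end of its label to the start of the next one.
    const sectionEnd = i + 1 < labels.length ? labels[i + 1].start : html.length;
    const section = html.slice(label.end, sectionEnd);
    const linkRe = new RegExp(UNIT_LINK.source, "g");
    let link: RegExpExecArray | null;
    while ((link = linkRe.exec(section)) !== null) {
      const code = link[1];
      const key = kind + ":" + code;
      if (code === ownCode || seen[key]) continue; // drop self-refs and duplicates
      seen[key] = true;
      refs.push({ kind, code });
    }
  });
  return refs;
}

// Invented description mixing three labels, a course link and a legacy host.
const example =
  '<strong>PREREQUISITE</strong> <a href="https://handbook.monash.edu/2024/units/XYZ1001">XYZ1001</a> ' +
  'or <a href="https://handbook.monash.edu/2024/courses/X0000">a course</a> ' +
  '<strong>PROHIBITION</strong> <a href="http://www3.monash.edu/pubs/2019handbooks/units/XYZ2002.html">XYZ2002</a> ' +
  '<strong>CO-REQUISITE</strong> <a href="https://handbook.monash.edu/2024/units/XYZ2000">XYZ2000</a>';

console.log(sketchExtractRefs("XYZ2000", example));
// Two refs: prerequisite XYZ1001 and prohibition XYZ2002. The course link
// is ignored and the self-referencing co-requisite is dropped.
```

Unit codes that appear only as plain text, without a link, are left unparsed here, in line with the regression case listed above. Keeping the matching in one pure function is also what would let an ingest step and a backfill share the same extraction.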
Hey @coldfinity, good catch on this. Some units do structure their prereqs as prose instead of a structured format. I've also pushed a follow-up commit to better classify them and to ensure future ingestions follow the same extraction path. Merging, tysm <3
Summary
- 211 units (MTH2021, MTH2010, and others across Science, Pharmacy, and Education) store their PREREQUISITE and PROHIBITION relationships as HTML prose in enrolment_rules rather than the structured requisites field. This left requisite_refs empty for those units, so the tree graph showed no edges and "what does X unlock" views omitted them entirely.
- Fix the ingest pipeline to also extract unit code refs from enrolment_rules descriptions that carry a <strong>PREREQUISITE</strong> or <strong>PROHIBITION</strong> label with handbook unit links.
- Add a data migration (0007) to backfill all existing years, inserting 4,496 new prerequisite refs and 1,683 prohibition refs.
- Fix a bug in requisite-tree-view where the units map title lookup was computed but leaf.academic_item_name was still rendered instead of the resolved name.

Test plan
- /tree?unit=MTH1030&direction=downstream: MTH2021 and MTH2010 now appear as nodes