Fix missing prerequisite edges for units using enrolment_rules #6

Merged
jason301c merged 2 commits into monashcoding:main from coldfinity:main on Jun 4, 2026
@coldfinity
Contributor

@coldfinity coldfinity commented May 21, 2026

Summary

  • 211 units (MTH2021, MTH2010, and others across Science, Pharmacy, and Education) store their PREREQUISITE and PROHIBITION relationships as HTML prose in enrolment_rules rather than the structured requisites field. This left requisite_refs empty for those units, so the tree graph showed no edges and "what does X unlock" views omitted them entirely.
  • Fixed the ingest pipeline to extract unit code refs from enrolment_rules descriptions that carry a <strong>PREREQUISITE</strong> or <strong>PROHIBITION</strong> label with handbook unit links.
  • Added migration 0007 to backfill all existing years — inserting 4,496 new prerequisite refs and 1,683 prohibition refs.
  • Fixed a bug in requisite-tree-view where the units map title lookup was computed but leaf.academic_item_name was still rendered instead of the resolved name.
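
The title-lookup fix in the last bullet can be illustrated with a minimal sketch. Only `leaf.academic_item_name` and the existence of a units map come from this PR; the `Leaf` and `UnitSummary` shapes, the `academic_item_code` and `title` field names, and the `resolveLeafLabel` helper are hypothetical and may not match the actual component.

```typescript
// Hypothetical shapes: only `academic_item_name` is named in the PR description.
interface Leaf {
  academic_item_code: string; // assumed field name for the unit code
  academic_item_name: string; // name stored on the leaf itself
}

interface UnitSummary {
  title: string; // assumed field name for the resolved unit title
}

// Prefer the title looked up in the units map; fall back to the leaf's own
// name only when the map has no entry for that code.
function resolveLeafLabel(leaf: Leaf, units: Map<string, UnitSummary>): string {
  const resolved = units.get(leaf.academic_item_code)?.title;
  return resolved ?? leaf.academic_item_name;
}

// Placeholder data, not real handbook titles.
const units = new Map<string, UnitSummary>([
  ["MTH1030", { title: "Resolved title for MTH1030" }],
]);

console.log(
  resolveLeafLabel(
    { academic_item_code: "MTH1030", academic_item_name: "stale leaf name" },
    units,
  ),
);
```

The point of the sketch is that the rendered value is the result of the lookup, with the leaf's own name kept purely as a fallback.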

Test plan

  • Navigate to /tree?unit=MTH1030&direction=downstream — MTH2021 and MTH2010 now appear as nodes
  • Click MTH2021 or MTH2010 in the tree — side panel shows MTH1030/MTH1035/ENG1005 as prerequisites in the enrolment rules section
  • Switch between the 2025 and 2026 handbook years: both show edges correctly (restart the dev server to clear the Next.js data cache after the migration)

…ent_rules

211 units (MTH2021, MTH2010, and others across Science, Pharmacy, and
Education) store their PREREQUISITE and PROHIBITION relationships as HTML
prose in enrolment_rules rather than the structured requisites field.
This left requisite_refs empty for those units, so the tree graph showed
no edges and "what does X unlock" views omitted them entirely.

Fix the ingest pipeline to also extract unit code refs from enrolment_rules
descriptions that carry a <strong>PREREQUISITE</strong> or
<strong>PROHIBITION</strong> label with handbook unit links. Add a data
migration (0007) to backfill all existing years — inserting 4,496 new
prerequisite refs and 1,683 prohibition refs.

Also fix a bug in requisite-tree-view where the units map title lookup
was computed but leaf.academic_item_name was still rendered instead of
the resolved name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented May 21, 2026

@coldfinity is attempting to deploy a commit to the monash-coding's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coldfinity coldfinity marked this pull request as draft May 21, 2026 11:29
@coldfinity coldfinity closed this May 21, 2026
@coldfinity coldfinity reopened this May 21, 2026
@coldfinity coldfinity marked this pull request as ready for review May 21, 2026 11:51
… hosts, ingest/SQL parity

Validated the first pass against the live corpus (31,098 enrolment_rules rows,
2020-2026) and found three correctness gaps:

- Whole-description classification mislabeled edges. 121 descriptions mix
  PREREQUISITE + PROHIBITION (81/32 also mix CO-REQUISITE), so labelling the
  whole blob by its first <strong> tag put ~126 links under the wrong type
  (e.g. a prohibited unit shown as a prerequisite). Now split each description
  at every <strong> label and classify per section.

- The migration and the ingest parser used different logic, so a re-ingest
  would rewrite the backfilled rows. They now share identical extraction --
  verified byte-identical over the whole corpus (6,624 refs, 0 drift).

- Only handbook.monash.edu unit URLs were matched. Recover legacy hosts
  (www[3].monash.edu/pubs/.../units/CODE.html); still ignore /courses/ and
  /aos/ links that appear in the same prose, and drop self-references
  (105 artifacts like CHM3990's own corequisite). Also extract CO-REQUISITE
  (591 edges).
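
The three fixes above (per-section classification, legacy hosts, self-reference removal) can be sketched roughly as follows. This is not the merged `extractEnrolmentRuleRefs()`; the function name, the regexes, the assumed unit-code shape (three letters plus four digits), and the sample HTML, including the legacy path segment, are illustrative assumptions.

```typescript
type RefType = "PREREQUISITE" | "PROHIBITION" | "CO-REQUISITE";

interface EnrolmentRuleRef {
  type: RefType;
  code: string;
}

// Assumed label markup: <strong>LABEL</strong>, case-insensitive.
const LABEL_RE = /<strong>\s*(PREREQUISITE|PROHIBITION|CO-REQUISITE)\s*<\/strong>/gi;

// Assumed URL shapes: current handbook host or legacy www/www3 "pubs" host,
// with a /units/CODE segment. /courses/ and /aos/ links never match.
const UNIT_LINK_RE =
  /href="https?:\/\/(?:handbook\.monash\.edu|www3?\.monash\.edu\/pubs)\/(?:[^"\/]+\/)*units\/([A-Z]{3}\d{4})(?:\.html)?(?:[?#][^"]*)?"/gi;

function extractEnrolmentRuleRefsSketch(unitCode: string, html: string): EnrolmentRuleRef[] {
  const labels = Array.from(html.matchAll(LABEL_RE));
  const seen = new Set<string>();
  const refs: EnrolmentRuleRef[] = [];

  labels.forEach((label, i) => {
    // A section runs from the end of this label to the start of the next one.
    const start = label.index! + label[0].length;
    const end = i + 1 < labels.length ? labels[i + 1].index! : html.length;
    const type = label[1].toUpperCase() as RefType;

    for (const link of Array.from(html.slice(start, end).matchAll(UNIT_LINK_RE))) {
      const code = link[1].toUpperCase();
      const key = `${type}:${code}`;
      // Drop self-references and duplicates within the same type.
      if (code === unitCode.toUpperCase() || seen.has(key)) continue;
      seen.add(key);
      refs.push({ type, code });
    }
  });
  return refs;
}

// Made-up description for a hypothetical unit XYZ2000.
const sample =
  '<strong>PREREQUISITE</strong> <a href="https://handbook.monash.edu/2025/units/ABC1001">ABC1001</a> ' +
  'or <a href="https://handbook.monash.edu/2025/courses/C2000">C2000</a> ' +
  '<strong>PROHIBITION</strong> <a href="http://www3.monash.edu/pubs/handbooks/units/ABC1002.html">ABC1002</a> ' +
  '<strong>CO-REQUISITE</strong> <a href="https://handbook.monash.edu/2025/units/XYZ2000">XYZ2000</a> ABC1003';

console.log(extractEnrolmentRuleRefsSketch("XYZ2000", sample));
// → [ { type: "PREREQUISITE", code: "ABC1001" }, { type: "PROHIBITION", code: "ABC1002" } ]
```

In the sample, the course link, the self-referencing co-requisite, and the plain-text code `ABC1003` are all left out, mirroring the regression cases listed in this commit.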

Migration dry-run (rolled back) inserts 6,623 rows, near-disjoint from the
structured requisite_refs (1 incidental overlap, no-op under ON CONFLICT).
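
Why the overlap is harmless can be modelled in memory. The real migration is SQL and its conflict target is not shown in this PR, so the `RefRow` columns, the composite key, and `insertOnConflictDoNothing` below are assumptions used only to illustrate "skip on conflict" semantics.

```typescript
// Hypothetical row shape; the real requisite_refs columns may differ.
interface RefRow {
  year: number;
  unitCode: string;
  refCode: string;
  type: string;
}

// Assumed uniqueness key that a conflict would be detected on.
const conflictKey = (r: RefRow): string =>
  `${r.year}|${r.unitCode}|${r.refCode}|${r.type}`;

// Insert rows whose key is not already present; count the rest as skipped.
function insertOnConflictDoNothing(existing: RefRow[], incoming: RefRow[]) {
  const keys = new Set(existing.map(conflictKey));
  const inserted: RefRow[] = [];
  let skipped = 0;
  for (const row of incoming) {
    const key = conflictKey(row);
    if (keys.has(key)) {
      skipped++;
      continue;
    }
    keys.add(key);
    inserted.push(row);
  }
  return { inserted, skipped };
}

// Made-up rows: the first already exists in the structured refs.
const structured: RefRow[] = [
  { year: 2025, unitCode: "XYZ2000", refCode: "ABC1001", type: "PREREQUISITE" },
];
const extracted: RefRow[] = [
  { year: 2025, unitCode: "XYZ2000", refCode: "ABC1001", type: "PREREQUISITE" },
  { year: 2025, unitCode: "XYZ2000", refCode: "ABC1002", type: "PROHIBITION" },
];

const firstRun = insertOnConflictDoNothing(structured, extracted);
const secondRun = insertOnConflictDoNothing(structured.concat(firstRun.inserted), extracted);
console.log(firstRun.inserted.length, firstRun.skipped);   // 1 1
console.log(secondRun.inserted.length, secondRun.skipped); // 0 2
```

Under these assumptions, a row that already exists is skipped rather than duplicated, and re-running the same batch inserts nothing.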

Also:
- Extract the parser into a tested extractEnrolmentRuleRefs() with regression
  cases for each gotcha (CIV4283 mixed labels, MTH2010 course-link exclusion,
  CHM3990 self-coreq, plain-text codes left unparsed).
- Revert the unrelated pnpm-lock.yaml / pnpm-workspace.yaml churn (drizzle-orm
  pin relaxation + allowBuilds block) that rode along from the fork's main.
- Document the enrolment_rules quirk in docs/handbook-internals.md.
@jason301c
Collaborator

Hey @coldfinity, good catch on this. Some units do store their prereqs as prose instead of in a structured format. I've also pushed a follow-up commit to classify them better and to ensure future ingestions follow the same extraction path. Merging.

tysm <3

@jason301c jason301c merged commit bbf8ec1 into monashcoding:main Jun 4, 2026
1 check failed
2 participants