Added documentation to support userse by pendingintent · Pull Request #33 · cdisc-org/360i

pendingintent · 2026-06-05T18:35:23Z

Added documents to help support users.

Copilot

Pull request overview

This PR adds user-facing documentation for the synthetic NCT01797120 (PrE0102) dataset package under data/protocol/NCT01797120, including a concise top-level README and additional context inside the test_data folder.

Changes:

Added a test_data/README.md describing the synthetic dataset and summarizing the source trial.
Added a test_data/FEEDBACK.md capturing reviewer feedback that informed dataset/script adjustments.
Simplified data/protocol/NCT01797120/README.md to describe the directory contents and how synthetic data is generated.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
`data/protocol/NCT01797120/test_data/README.md`	New README describing the synthetic dataset and trial summary.
`data/protocol/NCT01797120/test_data/FEEDBACK.md`	New feedback log documenting issues/expectations for SDTM/ADaM-like outputs.
`data/protocol/NCT01797120/README.md`	Replaced the prior detailed dataset summary with a brief directory-level overview.


@@ -0,0 +1,34 @@
+# Recent feedback resulting in the new script generation for these datasets.
+
+The `../scripts/cdisc_generation_functions.py` file has been changed in accordance with the feedback below. All CSV files in this directory have been generated with the latest `../scripts/cdisc_generation_functions.py`.


+|---------------|-----------|
+| RANDOMIZED    | This is a value in the Protocol Milestone codelist.  The `DSCAT` for the record would be "PROTOCOL MILESTONE". |
+| TREATMENT     | I don't understand what a record with this value is supposed to mean.  This is not a value in any of the codelists for `DSDECOD`. Dates fall between those for "RANDOMIZED" records and records with other `DSDECOD` values, but are anywhere from a few weeks to a few months after the "RANDOMIZED" record. |
+| PROGRESSIVE DISEASE | This is a value in the Completion/Reason for Non-Completion codelist.  The `DSCAT` for the record would be "DISPOSITION EVENT".  We would also expect a `DSSCAT` value, probably "STUDY TREATMENT", since in this study, subjects are followed (ideally) until death, even if they've stopped treatment. Most subjects seem to have two records for the same date with `DSDECOD = "PROGRESSIVE DISEASE"`, one with `EPOCH = "TREATMENT"` and one with `EPOCH = "FOLLOW-UP"`.  This doesn't make sense, since any particular date falls into only one `EPOCH`, and there is only one disposition event of ending treatment for progressive disease.  Actually, since there are two treatments in the study, if the treatments were stopped at different times, it would be possible to have two disposition events for ending treatment, one with `DSSCAT = "FULVESTRANT"` and one with `DSSCAT = "EVEROLIMUS"`. |
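The `DS` expectations in the table above can be sketched as two small checks. This is a minimal illustration, not the code in `cdisc_generation_functions.py`: the function names, the mapping dictionary, and the sample subject are hypothetical, and the rows are assumed to be plain dictionaries keyed by SDTM variable name with the event date in `DSSTDTC`.

```python
from collections import defaultdict

# Hypothetical mapping taken from the feedback table; "TREATMENT" is left out
# on purpose because it is not a value in any DSDECOD codelist.
DSCAT_BY_DSDECOD = {
    "RANDOMIZED": "PROTOCOL MILESTONE",
    "PROGRESSIVE DISEASE": "DISPOSITION EVENT",
}


def assign_dscat(record):
    """Return a copy of a DS record with DSCAT filled in from DSDECOD."""
    dsdecod = record["DSDECOD"]
    if dsdecod not in DSCAT_BY_DSDECOD:
        raise ValueError(f"Unexpected DSDECOD value: {dsdecod!r}")
    return {**record, "DSCAT": DSCAT_BY_DSDECOD[dsdecod]}


def find_conflicting_epochs(records):
    """Find (USUBJID, DSSTDTC) pairs that are assigned to more than one EPOCH."""
    epochs = defaultdict(set)
    for rec in records:
        epochs[(rec["USUBJID"], rec["DSSTDTC"])].add(rec["EPOCH"])
    return {key: sorted(vals) for key, vals in epochs.items() if len(vals) > 1}


# Made-up sample rows that reproduce the duplicate-EPOCH problem.
sample = [
    {"USUBJID": "S-001", "DSSTDTC": "2014-01-06", "DSDECOD": "RANDOMIZED", "EPOCH": "SCREENING"},
    {"USUBJID": "S-001", "DSSTDTC": "2014-03-10", "DSDECOD": "PROGRESSIVE DISEASE", "EPOCH": "TREATMENT"},
    {"USUBJID": "S-001", "DSSTDTC": "2014-03-10", "DSDECOD": "PROGRESSIVE DISEASE", "EPOCH": "FOLLOW-UP"},
]
conflicts = find_conflicting_epochs(sample)
```

Keying the check on subject and date alone reflects the point that a given date can fall into only one `EPOCH`, regardless of `DSDECOD` or `DSSCAT`.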


+
+Reviewers would expect to have `DTHFL` included in the `DM` dataset and `DTHDTC` to be populated if `DTHFL = "Y"`. Admittedly, the fact that a patient died is usually collected in some other domain (probably `DS`), and added to `DM`.
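A consistency check along these lines could confirm the `DTHFL`/`DTHDTC` expectation on generated `DM` rows. It is only a sketch: the function name and sample rows are hypothetical, rows are assumed to be dictionaries with empty strings for nulls, and the rule that `DTHDTC` should be empty when `DTHFL` is null is an added assumption rather than something the feedback states.

```python
def check_death_variables(dm_rows):
    """Return (USUBJID, message) pairs for rows with inconsistent death variables."""
    problems = []
    for row in dm_rows:
        dthfl = row.get("DTHFL") or ""
        dthdtc = row.get("DTHDTC") or ""
        if dthfl not in ("", "Y"):
            problems.append((row["USUBJID"], "DTHFL should be 'Y' or null"))
        elif dthfl == "Y" and not dthdtc:
            problems.append((row["USUBJID"], "DTHFL is 'Y' but DTHDTC is missing"))
        elif dthfl == "" and dthdtc:
            # Assumption: a death date without the flag is treated as an error.
            problems.append((row["USUBJID"], "DTHDTC is populated but DTHFL is null"))
    return problems


# Made-up DM rows for illustration.
dm_sample = [
    {"USUBJID": "S-001", "DTHFL": "Y", "DTHDTC": "2015-02-11"},
    {"USUBJID": "S-002", "DTHFL": "Y", "DTHDTC": ""},
    {"USUBJID": "S-003", "DTHFL": "", "DTHDTC": ""},
]
death_problems = check_death_variables(dm_sample)
```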
+
+`TRT01A` is an ADaM variable, not an SDTM variable.  The arm to which a subject was randomized would be represented in some combination of `ARMCD`, `ARM`, `ACTARMCD`, and `ACTARM`.  `ARMCD` and `ARM` are code and text for an arm, as are `ACTARMCD` and `ACTARM`.  `ARMCD/ARM` are the same as `ACTARMCD/ACTARM` unless a subject receives no treatment (in which case `ACTARMCD/ACTARM` are null) or a subject receives a treatment other than that to which they were randomized.  I don't think we need to build the treated-wrong situation into the synthetic data, although I think the study included a couple of subjects who were never treated.  What's currently in `TRT01A` would probably be in `ACTARM` and `ARM`.
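The arm rule described above (planned and actual arm match unless the subject was never treated) might be expressed as follows. The function name, the `treated` argument, and the arm code and label in the example are placeholders, not values from the study or from the generation script; the treated-wrong case is deliberately not modeled, in line with the feedback.

```python
def assign_arm_variables(armcd, arm, treated):
    """Build the four SDTM arm variables for one subject.

    ACTARMCD/ACTARM mirror ARMCD/ARM for treated subjects and are null
    (represented here as empty strings) for subjects who were never treated.
    """
    return {
        "ARMCD": armcd,
        "ARM": arm,
        "ACTARMCD": armcd if treated else "",
        "ACTARM": arm if treated else "",
    }


# Placeholder arm code and label, for illustration only.
treated_subject = assign_arm_variables("ARM_A", "Hypothetical Arm A", treated=True)
untreated_subject = assign_arm_variables("ARM_A", "Hypothetical Arm A", treated=False)
```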


+
+
+The `EX` dataset is missing `EXFREQ`.
+`EXFREQ` for fulvestrant injections would likely be "ONCE" with a record for each injection, and `EXSTDTC = EXENDTC`. Given the dosing schedule (Cycle 1 Day 1, Cycle 1 Day 15, then Day 1 of every subsequent cycle), the minimum number of records would be two: one for the first two doses, given at a frequency of every 14 days, and a second for all the remaining injections, at a frequency of every 28 days.  In practice, since patient visits drift off-schedule, the one-record-per-dose approach is probably the better choice.
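The one-record-per-injection approach could be generated roughly like this. It is a sketch under stated assumptions: the function name, the subject ID, and the start date are made up, the 28-day cycle length is inferred from the "every 28 days" wording above, and the dates are the planned schedule with no visit drift applied.

```python
from datetime import date, timedelta

CYCLE_LENGTH_DAYS = 28  # assumption inferred from "every 28 days" in the feedback


def fulvestrant_ex_records(usubjid, first_dose, n_cycles):
    """Return one EX record per planned fulvestrant injection.

    Injections fall on Cycle 1 Day 1, Cycle 1 Day 15, and Day 1 of each
    later cycle. n_cycles must be at least 1.
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    # Day offsets from the first dose: C1D1, C1D15, then Day 1 of cycles 2..n.
    offsets = [0, 14] + [CYCLE_LENGTH_DAYS * c for c in range(1, n_cycles)]
    records = []
    for seq, offset in enumerate(offsets, start=1):
        dtc = (first_dose + timedelta(days=offset)).isoformat()
        records.append({
            "USUBJID": usubjid,
            "EXSEQ": seq,
            "EXTRT": "FULVESTRANT",
            "EXFREQ": "ONCE",
            "EXSTDTC": dtc,
            "EXENDTC": dtc,  # single injection, so start and end dates match
        })
    return records


ex_records = fulvestrant_ex_records("S-001", date(2014, 1, 6), n_cycles=3)
```

With three cycles this yields four records, one for each injection. A generator that simulates off-schedule visits would add its own jitter to each date before formatting it.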


Added documentation to support userse

4a5538a

Copilot AI review requested due to automatic review settings June 5, 2026 18:35

pendingintent self-assigned this Jun 5, 2026

pendingintent added the documentation Improvements or additions to documentation label Jun 5, 2026

Copilot started reviewing on behalf of pendingintent June 5, 2026 18:35 View session

pendingintent merged commit 6182d41 into main Jun 5, 2026
4 checks passed

Copilot AI reviewed Jun 5, 2026

pendingintent merged 1 commit into main from pi-update-documents


2 participants

