Added documentation to support users #33
Merged
Pull request overview
This PR adds user-facing documentation for the synthetic NCT01797120 (PrE0102) dataset package under data/protocol/NCT01797120, including a concise top-level README and additional context inside the test_data folder.
Changes:
- Added a `test_data/README.md` describing the synthetic dataset and summarizing the source trial.
- Added a `test_data/FEEDBACK.md` capturing reviewer feedback that informed dataset/script adjustments.
- Simplified `data/protocol/NCT01797120/README.md` to describe the directory contents and how synthetic data is generated.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `data/protocol/NCT01797120/test_data/README.md` | New README describing the synthetic dataset and trial summary. |
| `data/protocol/NCT01797120/test_data/FEEDBACK.md` | New feedback log documenting issues/expectations for SDTM/ADaM-like outputs. |
| `data/protocol/NCT01797120/README.md` | Replaced the prior detailed dataset summary with a brief directory-level overview. |
@@ -0,0 +1,34 @@
# Recent feedback resulting in the new script generation for these datasets.

The `../scripts/cdisc_generation_functions.py` file has been changed in accordance with the feedback below. All CSV files in this directory have been generated with the latest `../scripts/cdisc_generation_functions.py`.
|---------------|-----------|
| RANDOMIZED | This is a value in the Protocol Milestone codelist. The `DSCAT` for the record would be "PROTOCOL MILESTONE". |
| TREATMENT | I don't understand what a record with this value is supposed to mean. This is not a value in any of the codelists for `DSDECOD`. Dates fall between those for "RANDOMIZED" records and records with other `DSDECOD` values, but are anywhere from a few weeks to a few months after the "RANDOMIZED" record. |
| PROGRESSIVE DISEASE | This is a value in the Completion/Reason for Non-Completion codelist. The `DSCAT` for the record would be "DISPOSITION EVENT". We would also expect a `DSSCAT` value, probably "STUDY TREATMENT", since in this study subjects are followed (ideally) until death, even if they've stopped treatment. Most subjects seem to have two records for the same date with `DSDECOD = "PROGRESSIVE DISEASE"`, one with `EPOCH = "TREATMENT"` and one with `EPOCH = "FOLLOW-UP"`. This doesn't make sense, since any particular date falls into only one `EPOCH`, and there is only one disposition event of ending treatment for progressive disease. Actually, since there are two treatments in the study, if the treatments were stopped at different times, it would be possible to have two disposition events for ending treatment, one with `DSSCAT = "FULVESTRANT"` and one with `DSSCAT = "EVEROLIMUS"`. |
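
The `DS` feedback above could be checked with something like the following. This is only a sketch, not the logic in `cdisc_generation_functions.py`: the lookup table covers just the two `DSDECOD` values discussed, the function names are hypothetical, and the use of `USUBJID` and `DSSTDTC` as the subject and date columns is an assumption about the CSV layout.

```python
from collections import Counter

# Illustrative lookup only: the two DSDECOD values discussed in the feedback.
DSCAT_BY_DSDECOD = {
    "RANDOMIZED": "PROTOCOL MILESTONE",
    "PROGRESSIVE DISEASE": "DISPOSITION EVENT",
}


def assign_dscat(record):
    """Return a copy of the record with DSCAT set; reject unmapped DSDECOD values."""
    decod = record["DSDECOD"]
    if decod not in DSCAT_BY_DSDECOD:
        raise ValueError(f"DSDECOD {decod!r} is not in the illustrative lookup")
    return {**record, "DSCAT": DSCAT_BY_DSDECOD[decod]}


def duplicated_events(records):
    """Return keys (USUBJID, DSDECOD, DSSCAT, DSSTDTC) that appear more than once.

    EPOCH is deliberately left out of the key, so the same event recorded
    under two epochs on the same date is reported as a duplicate.
    """
    counts = Counter(
        (r["USUBJID"], r["DSDECOD"], r.get("DSSCAT", ""), r["DSSTDTC"])
        for r in records
    )
    return sorted(key for key, n in counts.items() if n > 1)
```

Because `DSSCAT` is part of the key, two end-of-treatment events on the same date with `DSSCAT = "FULVESTRANT"` and `DSSCAT = "EVEROLIMUS"` would not be flagged, while a value such as "TREATMENT" in `DSDECOD` raises an error instead of being silently categorized.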
|
|
Reviewers would expect `DTHFL` to be included in the `DM` dataset and `DTHDTC` to be populated if `DTHFL = "Y"`. Admittedly, the fact that a patient died is usually collected in some other domain (probably `DS`) and then added to `DM`.
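
A minimal sketch of that expectation as a check over `DM` rows follows. The function name and the assumption that each row is a dict keyed by column name (for example from `csv.DictReader`) with a `USUBJID` column are illustrative, not taken from the repository.

```python
def dm_death_issues(dm_rows):
    """Return (USUBJID, message) pairs for rows that break the DTHFL/DTHDTC expectation."""
    issues = []
    for row in dm_rows:
        subject = row.get("USUBJID", "")
        if "DTHFL" not in row:
            # The column itself is absent from the dataset.
            issues.append((subject, "DTHFL column missing"))
        elif row["DTHFL"] == "Y" and not row.get("DTHDTC"):
            # Death is flagged but no death date is given.
            issues.append((subject, "DTHDTC missing although DTHFL is Y"))
    return issues
```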
|
|
`TRT01A` is an ADaM variable, not an SDTM variable. The arm to which a subject was randomized would be represented in some combination of `ARMCD`, `ARM`, `ACTARMCD`, and `ACTARM`. `ARMCD` and `ARM` are the code and text for an arm, as are `ACTARMCD` and `ACTARM`. `ARMCD/ARM` are the same as `ACTARMCD/ACTARM` unless a subject receives no treatment (in which case `ACTARMCD/ACTARM` are null) or a subject receives a treatment other than that to which they were randomized. I don't think we need to build the treated-wrong situation into the synthetic data, although I think the study included a couple of subjects who were never treated. What's currently in `TRT01A` would probably be in `ACTARM` and `ARM`.
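
The rule described above, without the treated-wrong case, can be sketched as a small helper. The function name, the `treated` flag, and the arm code/text values in the usage lines are hypothetical placeholders, not values from the study or the generation script.

```python
def actual_arm(armcd, arm, treated):
    """Return (ACTARMCD, ACTARM) for a subject randomized to (ARMCD, ARM).

    Assumes no subject receives a treatment other than the randomized one,
    so the actual arm either equals the planned arm or is null (None).
    """
    if not treated:
        return (None, None)
    return (armcd, arm)


# Placeholder arm values for illustration only.
treated_subject = actual_arm("A", "Arm A", treated=True)
never_treated_subject = actual_arm("A", "Arm A", treated=False)
```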
|
|
The `EX` dataset is missing `EXFREQ`.
`EXFREQ` for fulvestrant injections would likely be "ONCE" with a record for each injection, and `EXSTDTC = EXENDTC`. Given the dosing schedule (Cycle 1 Day 1, Cycle 1 Day 15, then Day 1 of every subsequent cycle), the minimum number of records would be two: one for the first two doses, given at a frequency of every 14 days, and a second for all the remaining injections, with a frequency of every 28 days. Since patient visits drift off-schedule, the one-record-per-dose approach is probably more practical.
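
The one-record-per-dose approach might look like the sketch below. The function name is hypothetical, and the `USUBJID` and `EXTRT` columns are assumed here for illustration; only `EXFREQ`, `EXSTDTC`, and `EXENDTC` come from the feedback itself.

```python
from datetime import date


def fulvestrant_ex_records(usubjid, injection_dates):
    """Build one EX record per injection, with EXFREQ "ONCE" and EXSTDTC equal to EXENDTC."""
    return [
        {
            "USUBJID": usubjid,
            "EXTRT": "FULVESTRANT",
            "EXFREQ": "ONCE",
            "EXSTDTC": d.isoformat(),
            "EXENDTC": d.isoformat(),
        }
        for d in sorted(injection_dates)
    ]


# Example dates: Cycle 1 Day 1, Cycle 1 Day 15, then a slightly late Cycle 2 Day 1.
example = fulvestrant_ex_records(
    "S1", [date(2013, 6, 3), date(2013, 6, 17), date(2013, 7, 2)]
)
```

Since each record carries its own actual date, off-schedule visits need no special handling.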
Added documents to help support users.