docs: improve README example #201
embray
commented
Jun 22, 2026
- fixes "Running cube example in readme can crash on some files" (#196), with a better guard against empty history entries
- improves the tile cut-out example, demonstrating how to use asdf_value_find, ndarray->ndim and ndarray->shape, making the example more robust to arbitrary ASDF files (so long as they have an ndarray)
- better documents that the program can be run on cube.asdf from the source code test fixtures, along with the expected output
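The idea behind making the tile cut-out robust to arbitrary arrays can be sketched without libasdf at all. The helper below is hypothetical (it is not part of the library or of the README example): its `ndim` and `shape` parameters merely stand in for `ndarray->ndim` and `ndarray->shape`, and it clamps a requested tile so that it never reaches past the array bounds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libasdf function: clamp a requested tile to
 * the bounds of an array with `ndim` dimensions of extent `shape[d]`.
 * `origin[d]` is the first index of the tile along dimension d and
 * `count[d]` the requested extent; `count` is shrunk in place if needed.
 * Returns the number of elements in the clamped tile, or 0 if the tile
 * is empty or starts outside the array. */
static uint64_t clamp_tile(uint32_t ndim, const uint64_t *shape,
                           const uint64_t *origin, uint64_t *count) {
    if (ndim == 0 || shape == NULL || origin == NULL || count == NULL)
        return 0;

    uint64_t nelem = 1;
    for (uint32_t d = 0; d < ndim; d++) {
        if (origin[d] >= shape[d])
            return 0; /* tile starts past the end of this dimension */

        uint64_t avail = shape[d] - origin[d];
        if (count[d] > avail)
            count[d] = avail; /* shrink the tile to what is available */

        nelem *= count[d];
    }
    return nelem;
}
```

With a shape of {10, 20, 30}, an origin of {8, 0, 25} and a requested count of {4, 5, 10}, the count is clamped to {2, 5, 5}, so the tile holds 50 elements.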
@SAY-5 @braingram would you like to give the updated example a look? Arguably it's still a bit long, but pretty comprehensive at least.
Thanks for making these updates. I think we should simplify this example and make it not depend on any file. The current example covers:
Thinking of this as the first example a user will encounter, I think it can be improved by removing some of the more "advanced" functions, including:
Instead the example could be:
That sounds fine. I think it would still be good to include an example of how to read out of the YAML too, but something simpler. I'll add two separate examples: one demonstrating how to read from YAML, and one demonstrating how to read/write a simple ndarray. The more comprehensive example can be moved to a separate page in the documentation.
@braingram Good suggestion. I think the new examples are much better, more digestible and tutorial-friendly now: https://github.com/embray/libasdf/blob/2bcf06e0380936c335a90617e5e18cb466d87e42/README.rst Happy to further refine.
.. code:: c
   :name: test-read-metadata-ndarray
   :test: test-write-file
Splitting the example into 2 programs (asdf-write and asdf-read) improves the approachability of the example(s) and makes them easier to follow.
However, this produces an invalid file (the serialized time is invalid).
'id': 'http://stsci.edu/schemas/asdf/time/time-1.4.0',
'title': 'Represents an instance in time.'}
On instance:
{'value': 'J1948.78707178', 'format': 'iso_time'}
The format also doesn't match roman files yet the example mentions Roman which is misleading. Roman files do not have any "obstime" or "observer" keys and the "data" is nested under a parent "roman" key.
Let's avoid linking libasdf to roman here in the README and cover that elsewhere (where roman files can be used).
I'd also say we avoid anything FITS-like here and cover that elsewhere.
How about an example file similar to the python example with a tree containing?:
- name: Dennis Ritchie
- foo: 42
- sequence: array of numbers 1-99
- powers/squares: squares of sequence
(leaving out random since it seems overly complicated for this example)
The write could cover:
- setting a scalar string/number
- creating an array
- nesting data
Read could cover accessing all of these values (and perhaps summing the sequence and squares).
Hmm, I'm not so sure about any of this to be honest.
On instance:
{'value': 'J1948.78707178', 'format': 'iso_time'}
I don't know how you got that. Might be using an older version of the library? The generated file has a copy committed in the repository (for comparison in the tests). It clearly outputs:
obstime: !time/time-1.4.0
  value: J1948.78707178
  format: jyear
This is valid according to the schema (which I think itself is a bit buggy, but, the fact that this is accepted seems reasonable): https://github.com/asdf-format/asdf-standard/blob/main/resources/stable/schemas/stsci.edu/asdf/time/time-1.4.0.yaml
Though interestingly when I tried to open the file in Python I got a different exception -- it passed schema validation and threw:
ValueError: Input values did not match the format class jyear:
TypeError: for jyear class, input should be (long) doubles, string, or Decimal, and second values are only allowed for (long) doubles.
This looks like a bug through and through. Makes no sense.
I could try changing the example to a different time format for now just to avoid any confusion from this, but will investigate the upstream bug...
The format also doesn't match roman files yet the example mentions Roman which is misleading. Roman files do not have any "obstime" or "observer" keys and the "data" is nested under a parent "roman" key.
This statement is also a bit confusing to me. It's just using Nancy Roman as an example "observer". I guess her name is on my mind for obvious reasons. I don't think anyone is ever confused in FITS examples when "OBSERVER = Edwin Hubble" that the example represents a valid HST data product.
But sure I could change it to Dennis Ritchie or Daffy Duck for all it matters.
How about an example file similar to the python example with a tree containing?
I like the idea of exactly mirroring the Python example in principle. But unfortunately I don't think it translates as well to the C library in a succinct way. In Python, preparing a dictionary of small ndarrays is practically a one-liner. In C each one of those examples is several lines of setup and teardown code that has to be managed, bogging down the simplicity of the example.
The above statement is maybe also a good argument for adding some helper macros for creating simple ndarrays, or mapping builders. This is something I've been thinking about, but probably out of scope for now.
Hmm, I'm not so sure about any of this to be honest.
On instance:
{'value': 'J1948.78707178', 'format': 'iso_time'}
I don't know how you got that. Might be using an older version of the library?
Likely yes. After pulling down fresh code and recompiling the library, I don't see the odd 'iso_time'.
The generated file has a copy committed in the repository (for comparison in the tests). It clearly outputs:
obstime: !time/time-1.4.0
  value: J1948.78707178
  format: jyear
This is valid according to the schema (which I think itself is a bit buggy, but, the fact that this is accepted seems reasonable): https://github.com/asdf-format/asdf-standard/blob/main/resources/stable/schemas/stsci.edu/asdf/time/time-1.4.0.yaml
I wouldn't trust this schema to be a complete description of how time formats are handled. This is another reason to avoid using time in this example as that code has not been fully vetted.
This statement is also a bit confusing to me. It's just using Nancy Roman as an example "observer". I guess her name is on my mind for obvious reasons. I don't think anyone is ever confused in FITS examples when "OBSERVER = Edwin Hubble" that the example represents a valid HST data product.
But sure I could change it to Dennis Ritchie or Daffy Duck for all it matters.
How about an example file similar to the python example with a tree containing?
I like the idea of exactly mirroring the Python example in principle. But unfortunately I don't think it translates as well to the C library in a succinct way. In Python, preparing a dictionary of small ndarrays is practically a one-liner. In C each one of those examples is several lines of setup and teardown code that has to be managed, bogging down the simplicity of the example.
Here are the candidate examples, with some inline comments to help readers associate the API with specific operations:
#include <stdint.h>
#include <asdf.h>

int main(int argc, char **argv) {
    const char *filename = "out.asdf";

    // open a "NULL" file for writing
    asdf_file_t *file = asdf_open(NULL);

    // assign a string to the "name" key of the ASDF tree
    asdf_set_string0(file, "name", "Dennis Ritchie");

    // assign a numeric value to the "foo" key
    asdf_set_int64(file, "foo", 42);

    // construct 2 arrays containing numeric values
    uint64_t N = 100;
    asdf_ndarray_t sequence = {
        .ndim = 1,
        .shape = (uint64_t[]){N},
        .datatype = {.type = ASDF_DATATYPE_UINT64}
    };
    // the element type of the pointer must match ASDF_DATATYPE_UINT64
    uint64_t *sequence_data = asdf_ndarray_data_alloc(&sequence);

    asdf_ndarray_t squares = {
        .ndim = 1,
        .shape = (uint64_t[]){N},
        .datatype = {.type = ASDF_DATATYPE_UINT64}
    };
    uint64_t *squares_data = asdf_ndarray_data_alloc(&squares);

    // fill both arrays; the index is unsigned to match N
    for (uint64_t idx = 0; idx < N; idx++) {
        sequence_data[idx] = idx;
        squares_data[idx] = idx * idx;
    }

    // assign the "sequence" array to the "sequence" key
    asdf_set_ndarray(file, "sequence", &sequence);

    // nest the "squares" array under a parent "powers" key
    asdf_set_ndarray(file, "powers/squares", &squares);

    // write the ASDF file to disk
    asdf_write_to(file, filename);

    // clean up allocations
    asdf_ndarray_data_dealloc(&sequence);
    asdf_ndarray_data_dealloc(&squares);
    asdf_close(file);
    return 0;
}
and reading:
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <asdf.h>

int main(int argc, char **argv) {
    const char *filename = "out.asdf";

    // open the ASDF file for reading
    asdf_file_t *file = asdf_open(filename, "r");
    if (!file) {
        fprintf(stderr, "Failed to open the file: %s\n", asdf_error(file));
        return 1;
    }

    // read and print the string stored under "name"
    const char *name = NULL;
    if (asdf_get_string0(file, "name", &name) == ASDF_VALUE_OK) {
        printf("name: %s\n", name);
    }

    // read and print the numeric value stored under "foo"
    // (PRId64 is the portable printf format for int64_t)
    int64_t foo = 0;
    if (asdf_get_int64(file, "foo", &foo) == ASDF_VALUE_OK) {
        printf("foo: %" PRId64 "\n", foo);
    }

    // read the "squares" array nested under the "powers" key
    asdf_ndarray_t *squares = NULL;
    uint64_t *squares_data = NULL;
    if (asdf_get_ndarray(file, "powers/squares", &squares) == ASDF_VALUE_OK) {
        if (asdf_ndarray_read_all(squares, ASDF_DATATYPE_UINT64,
                                  (void **)&squares_data) == ASDF_NDARRAY_OK) {
            // print the sum of the squares array
            uint64_t nelem = asdf_ndarray_size(squares);
            uint64_t sum = 0;
            for (uint64_t idx = 0; idx < nelem; idx++) {
                sum += squares_data[idx];
            }
            printf("sum of squares values: %" PRIu64 "\n", sum);
        }
    }

    // clean up allocations
    free(squares_data);
    asdf_ndarray_destroy(squares);
    asdf_close(file);
    return 0;
}
I think it's helpful here to show:
- reading and writing string and numeric values in the tree
- reading and writing arrays
- that ASDF is hierarchical (by nesting the "squares" array under "powers")
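As a sanity check on what the reader program should print, the sums can be computed independently of libasdf. The sketch below assumes the arrays hold the values 0 through N-1 and their squares, as in the writer program above; the function names are made up for illustration.

```c
#include <stdint.h>

/* Sum of 0 + 1 + ... + (n - 1), i.e. the contents of "sequence". */
static uint64_t sum_of_sequence(uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t idx = 0; idx < n; idx++)
        sum += idx;
    return sum;
}

/* Sum of idx * idx for idx in [0, n), using the same loop shape as the
 * reader example. */
static uint64_t sum_of_squares_loop(uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t idx = 0; idx < n; idx++)
        sum += idx * idx;
    return sum;
}

/* Closed form for the same quantity: (n - 1) * n * (2n - 1) / 6.
 * The n == 0 case is handled separately to avoid unsigned wrap-around. */
static uint64_t sum_of_squares_closed(uint64_t n) {
    if (n == 0)
        return 0;
    return (n - 1) * n * (2 * n - 1) / 6;
}
```

For N = 100 the sequence sums to 4950 and both versions of the squares sum give 328350, which is the value the reader should report for "powers/squares".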
While working on gh-201 I wanted to use setting a time as a simple example, but exercising the time extension more revealed further problems:
- parsing didn't work for jyear format times; this is now added, as well as for the decimalyear format, leaving more formats to support for a follow-up issue
- corrected not-quite-right handling of time values given as numerical scalars; rather than parsing their numerical values out of YAML and then re-formatting them, just take the YAML scalar verbatim, as the individual format parsers expect
- fixed deserialization of the optional 'scale' property, which previously always defaulted to 'utc'. This was certainly intended as a TODO, but I forgot to note it
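To make the jyear case concrete, here is a minimal sketch of parsing such a value. It is not the library's parser: the function names are hypothetical, and it assumes the `J<decimal year>` spelling seen in the example file (`J1948.78707178`). The conversion uses the standard definition of a Julian epoch, where J2000.0 corresponds to JD 2451545.0 and a Julian year is exactly 365.25 days.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not libasdf's implementation: parse a string of
 * the form "J<digits>[.<digits>]" into a decimal Julian year.
 * Returns 0 on success and -1 if the string is malformed. */
static int parse_jyear(const char *s, double *year_out) {
    if (s == NULL || year_out == NULL || s[0] != 'J')
        return -1;

    /* Accept only digits and at most one decimal point after the 'J'. */
    int ndigits = 0;
    int ndots = 0;
    for (const char *p = s + 1; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p))
            ndigits++;
        else if (*p == '.' && ndots == 0)
            ndots++;
        else
            return -1;
    }
    if (ndigits == 0)
        return -1;

    *year_out = strtod(s + 1, NULL);
    return 0;
}

/* Convert a Julian epoch year to a Julian Date. */
static double jyear_to_jd(double year) {
    return 2451545.0 + (year - 2000.0) * 365.25;
}
```

Validating the characters before calling `strtod` keeps the sketch from accepting inputs such as hexadecimal floats or "inf", which `strtod` alone would parse.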
- fixes gh-196, with better guard against empty history entries
- improves the tile cut-out example, demonstrating how to use asdf_value_find, ndarray->ndim and ndarray->shape, making the example more robust to arbitrary ASDF files (so long as they have an ndarray)
- better document that the program *can* be run on cube.asdf from the source code test fixtures and the expected output
and properly report failing doc example tests. This happened to work on the cmake version of this test since it doesn't rely on the generated wrapper script. gh-196
The new example programs give the user the opportunity to both write out an ASDF file from scratch, then read data back in from that same file (rather than relying on some ill-defined input file from somewhere). The examples are shorter and, I think, speak more for themselves in what's going on. It is more "tutorial-like" and might give an interested user a starting point to build on.
The old example is kept but moved to a new "Examples" page in the docs, with an invitation to expand on it.
Took this as an opportunity to make improvements to the "doctest" extraction and running process, since it was really kind of necessary to do a little better for this. Now a doctest can be marked explicitly with a `:test:` option, and can also be passed a particular file it needs to be run with in the test suite via `:fixture:`. This framework is interesting and could use even further enhancements, maybe as a stand-alone sphinx extension down the line...
- avoid writing time with JXXXX year format and format: jyear, as this triggers a bug in Python ASDF when trying to read the file (should be addressed in a follow-up)
- change example observer name to avoid possible confusion (Dennis Ritchie does not have any telescopes named after him that I know of)