docs: improve README example#201

Open
embray wants to merge 7 commits into
asdf-format:main from
embray:issue-196

embray commented Jun 22, 2026

Collaborator
  • fixes #196 (Running cube example in readme can crash on some files), with a better guard against empty history entries
  • improves the tile cut-out example, demonstrating how to use asdf_value_find, ndarray->ndim and ndarray->shape, making the example more robust to arbitrary ASDF files (so long as they have an ndarray)
  • better documents that the program can be run on cube.asdf from the source-code test fixtures, along with the expected output

embray added the documentation label Jun 22, 2026

embray commented Jun 22, 2026

Collaborator Author

@SAY-5 @braingram would you like to give the updated example a look?

Arguably it's still a bit long, but pretty comprehensive at least.

@braingram

Contributor

Thanks for making these updates. I think we should simplify this example and make it not depend on any file. The current example covers:

  • asdf_open
  • asdf_get_string0 (Reading "asdf_library/author")
  • asdf_get_meta and history entries
  • asdf_meta_destroy
  • asdf_get_value
  • asdf_value_find
  • asdf_value_as_ndarray
  • asdf_value_path
  • asdf_value_destroy
  • asdf_ndarray_data
  • asdf_ndarray_read_tile_ndim
  • asdf_ndarray_destroy
  • asdf_close

Thinking of this as the first example a user will encounter, I think it can be improved by removing some of the more "advanced" functions, including:

  • asdf_get_meta (and asdf_meta_destroy)
  • asdf_value_find
  • asdf_value_as_ndarray
  • asdf_value_path
  • asdf_value_destroy
  • asdf_ndarray_data
  • asdf_ndarray_read_tile_ndim

Instead the example could be:

  • create a simple asdf file with 1 array at "data"
  • read back in the file and return a value from that "data" array (using asdf_ndarray_read_all)

embray commented Jun 22, 2026

Collaborator Author

create a simple asdf file with 1 array at "data"
read back in the file and return a value from that "data" array (using asdf_ndarray_read_all)

That sounds fine. I think it would still be good to include an example of how to read out of the YAML too, but something simpler.

I'll add two separate examples: one demonstrating how to read from YAML, and one demonstrating how to read/write a simple ndarray. The more comprehensive example can be moved to a separate page in the documentation.

embray commented Jun 23, 2026

Collaborator Author

@braingram Good suggestion. I think the new examples are much better, more digestible and tutorial-friendly now: https://github.com/embray/libasdf/blob/2bcf06e0380936c335a90617e5e18cb466d87e42/README.rst

Happy to further refine.

Comment thread README.rst

.. code:: c
   :name: test-read-metadata-ndarray
   :test: test-write-file

Contributor

Splitting the example into two programs (asdf-write and asdf-read) improves the approachability of the examples and makes them easier to follow.

However, this produces an invalid file (the serialized time is invalid).

     'id': 'http://stsci.edu/schemas/asdf/time/time-1.4.0',
     'title': 'Represents an instance in time.'}

On instance:
    {'value': 'J1948.78707178', 'format': 'iso_time'}

The format also doesn't match Roman files, yet the example mentions Roman, which is misleading. Roman files do not have any "obstime" or "observer" keys, and the "data" is nested under a parent "roman" key.

Let's avoid linking libasdf to Roman here in the README and cover that elsewhere (where Roman files can be used).

I'd also suggest we avoid anything FITS-like here and cover that elsewhere.

How about an example file similar to the Python example, with a tree containing:

  • name: Dennis Ritchie
  • foo: 42
  • sequence: array of numbers 1-99
  • powers/squares: squares of sequence

(leaving out random since it seems overly complicated for this example)

The write could cover:

  • setting a scalar string/number
  • creating an array
  • nesting data

Read could cover accessing all of these values (and perhaps summing the sequence and squares).

Collaborator Author

Hmm, I'm not so sure about any of this to be honest.

On instance:
{'value': 'J1948.78707178', 'format': 'iso_time'}

I don't know how you got that. Might be using an older version of the library? The generated file has a copy committed in the repository (for comparison in the tests). It clearly outputs:

obstime: !time/time-1.4.0
  value: J1948.78707178
  format: jyear

This is valid according to the schema (which I think itself is a bit buggy, but, the fact that this is accepted seems reasonable): https://github.com/asdf-format/asdf-standard/blob/main/resources/stable/schemas/stsci.edu/asdf/time/time-1.4.0.yaml

Though interestingly when I tried to open the file in Python I got a different exception -- it passed schema validation and threw:

ValueError: Input values did not match the format class jyear:
TypeError: for jyear class, input should be (long) doubles, string, or Decimal, and second values are only allowed for (long) doubles.

This looks like a bug through and through. Makes no sense.

I could try changing the example to a different time format for now just to avoid any confusion from this, but will investigate the upstream bug...

The format also doesn't match roman files yet the example mentions Roman which is misleading. Roman files do not have any "obstime" or "observer" keys and the "data" is nested under a parent "roman" key.

This statement is also a bit confusing to me. It's just using Nancy Roman as an example "observer". I guess her name is on my mind for obvious reasons. I don't think anyone is ever confused in FITS examples when "OBSERVER = Edwin Hubble" that the example represents a valid HST data product.

But sure I could change it to Dennis Ritchie or Daffy Duck for all it matters.

How about an example file similar to the python example with a tree containing?

I like the idea of exactly mirroring the Python example in principle. But unfortunately I don't think it translates as well to the C library in a succinct way. In Python, preparing a dictionary of small ndarrays is practically a one-liner. In C each one of those examples is several lines of setup and teardown code that has to be managed, bogging down the simplicity of the example.

The above statement is maybe also a good argument for adding some helper macros for creating simple ndarrays, or mapping builders. This is something I've been thinking about, but probably out of scope for now.

Contributor

Hmm, I'm not so sure about any of this to be honest.

On instance:
{'value': 'J1948.78707178', 'format': 'iso_time'}

I don't know how you got that. Might be using an older version of the library?

Likely yes. Pulling down fresh code and recompiling the library I don't see the odd 'iso_time'.

The generated file has a copy committed in the repository (for comparison in the tests). It clearly outputs:

obstime: !time/time-1.4.0
  value: J1948.78707178
  format: jyear

This is valid according to the schema (which I think itself is a bit buggy, but, the fact that this is accepted seems reasonable): https://github.com/asdf-format/asdf-standard/blob/main/resources/stable/schemas/stsci.edu/asdf/time/time-1.4.0.yaml

I wouldn't trust this schema to be a complete description of how time formats are handled. This is another reason to avoid using time in this example as that code has not been fully vetted.

This statement is also a bit confusing to me. It's just using Nancy Roman as an example "observer". I guess her name is on my mind for obvious reasons. I don't think anyone is ever confused in FITS examples when "OBSERVER = Edwin Hubble" that the example represents a valid HST data product.

But sure I could change it to Dennis Ritchie or Daffy Duck for all it matters.

How about an example file similar to the python example with a tree containing?

I like the idea of exactly mirroring the Python example in principle. But unfortunately I don't think it translates as well to the C library in a succinct way. In Python, preparing a dictionary of small ndarrays is practically a one-liner. In C each one of those examples is several lines of setup and teardown code that has to be managed, bogging down the simplicity of the example.

Here are the candidate examples, with some inline comments to help readers associate the API with specific operations:

#include <stdint.h>
#include <asdf.h>

int main(int argc, char **argv) {
    const char *filename = "out.asdf";

    // open a "NULL" file for writing
    asdf_file_t *file = asdf_open(NULL);

    // assign a string to the "name" key of the ASDF tree
    asdf_set_string0(file, "name", "Dennis Ritchie");

    // assign a numeric value to the "foo" key
    asdf_set_int64(file, "foo", 42);

    // construct 2 arrays containing numeric values
    uint64_t N = 100;

    asdf_ndarray_t sequence = {
        .ndim = 1,
        .shape = (uint64_t[]){N},
        .datatype = {.type = ASDF_DATATYPE_UINT64}
    };
    // the element type must match the declared datatype (uint64)
    uint64_t *sequence_data = asdf_ndarray_data_alloc(&sequence);

    asdf_ndarray_t squares = {
        .ndim = 1,
        .shape = (uint64_t[]){N},
        .datatype = {.type = ASDF_DATATYPE_UINT64}
    };
    uint64_t *squares_data = asdf_ndarray_data_alloc(&squares);

    // fill the arrays with 0..N-1 and the corresponding squares
    for (uint64_t idx = 0; idx < N; idx++) {
        sequence_data[idx] = idx;
        squares_data[idx] = idx * idx;
    }

    // assign the "sequence" array to the "sequence" key
    asdf_set_ndarray(file, "sequence", &sequence);

    // nest the "squares" array under a parent "powers" key
    asdf_set_ndarray(file, "powers/squares", &squares);

    // write the ASDF file to disk
    asdf_write_to(file, filename);

    // clean up allocations
    asdf_ndarray_data_dealloc(&sequence);
    asdf_ndarray_data_dealloc(&squares);
    asdf_close(file);
    return 0;
}

and reading:

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <asdf.h>

int main(int argc, char **argv) {
    const char *filename = "out.asdf";

    // open the ASDF file for reading
    asdf_file_t *file = asdf_open(filename, "r");
    if (!file) {
        fprintf(stderr, "Failed to open the file: %s\n", asdf_error(file));
        return 1;
    }

    // read and print the string stored under "name"
    const char *name = NULL;
    if (asdf_get_string0(file, "name", &name) == ASDF_VALUE_OK) {
        printf("name: %s\n", name);
    }

    // read and print the numeric value stored under "foo"
    int64_t foo = 0;
    if (asdf_get_int64(file, "foo", &foo) == ASDF_VALUE_OK) {
        // PRId64 is the portable format specifier for int64_t
        printf("foo: %" PRId64 "\n", foo);
    }

    // read the "squares" array nested under the "powers" key
    asdf_ndarray_t *squares = NULL;
    uint64_t *squares_data = NULL;
    if (asdf_get_ndarray(file, "powers/squares", &squares) == ASDF_VALUE_OK) {
        if (asdf_ndarray_read_all(squares, ASDF_DATATYPE_UINT64, (void **)&squares_data) == ASDF_NDARRAY_OK) {
            // print the sum of the squares array
            uint64_t nelem = asdf_ndarray_size(squares);
            uint64_t sum = 0;
            for (uint64_t idx = 0; idx < nelem; idx++) {
                sum += squares_data[idx];
            }
            printf("sum of squares values: %" PRIu64 "\n", sum);
        }
    }

    // clean up allocations
    free(squares_data);
    asdf_ndarray_destroy(squares);
    asdf_close(file);
    return 0;
}

I think it's helpful here to show:

  • reading and writing string and numeric values in the tree
  • reading and writing arrays
  • that ASDF is hierarchical (by nesting the "squares" array under "powers")

embray added a commit that referenced this pull request Jun 24, 2026
While working on gh-201 I wanted to use setting a time as a simple
example but exercising the time extension more revealed further
problems:

- parsing doesn't work for jyear format times -- this is now added,
  as well as for the decimalyear format; leaving more formats to
  support for a follow-up issue

- corrected not-quite-right handling of time values given as numerical
  scalars; rather than parsing their numerical values out of YAML and
  then re-formatting them, just take the YAML scalar verbatim, as the
  individual format parsers expect

- fixed deserialization of the optional 'scale' property, which
  previously always defaulted to 'utc'.  This was certainly intended
  as a TODO, but I forgot to note it
embray added 7 commits June 24, 2026 14:35
- fixes gh-196, with better guard against empty history entries
- improves the tile cut-out example, demonstrating how to use
  asdf_value_find, ndarray->ndim and ndarray->shape, making the
  example more robust to arbitrary ASDF files (so long as they
  have an ndarray)
- better document that the program *can* be run on cube.asdf from
  the source code test fixtures and the expected output
and properly report failing doc example tests.  This happened to work on
the cmake version of this test since it doesn't rely on the generated
wrapper script.

gh-196
The new example programs give the user the opportunity to both write out
an ASDF file from scratch, then read data back in from that same file
(rather than relying on some ill-defined input file from somewhere).
The examples are shorter and, I think, speak more for themselves in
what's going on.  It is more "tutorial-like" and might give an
interested user a starting point to build on.

The old example is kept but moved to a new "Examples" page in the docs,
with an invitation to expand on it.

Took this as an opportunity to make improvements to the "doctest"
extraction and running process, since it was really kind of necessary to
do a little better for this.  Now a doctest can be marked explicitly
with a `:test:` option, and can also be passed a particular file it
needs to be run with in the test suite via `:fixture:`.

This framework is interesting and could use even further enhancements
maybe as a stand-alone sphinx extension down the line...
- avoid writing time with JXXXX year format and format: jyear
  as this triggers a bug in Python ASDF when trying to read the
  file (should be addressed in a follow-up)

- change example observer name to avoid possible confusion (Dennis
  Ritchie does not have any telescopes named after him that I know of)

Labels

documentation Improvements or additions to documentation

Development

Successfully merging this pull request may close these issues.

Running cube example in readme can crash on some files

2 participants