Reduce the memory usage that is important for ne1024 simulation#414
Conversation
billsacks
left a comment
There was a problem hiding this comment.
This seems like a useful feature.
However: Please change the ordering of arguments so that rc remains the last argument in the subroutine definitions - i.e., putting the new stream_mesh_in argument before the rc argument. Please do this for both (1) shr_strdata_init (this is private to this module so the argument order for any callers can easily be changed if needed), and (2) shr_strdata_init_from_inline. (2) is public so in theory could impact callers in other repositories; however, this seems pretty unlikely since there are already optional arguments appearing before rc, so presumably callers are already using named arguments at that point in the call... but this should be verified by doing builds of at least (a) an A compset (all data models) and (b) a B compset (e.g., B1850C_LTso).
|
Thanks @billsacks for your comment. I just made your suggested changes. One thing I want to point out, putting |
|
Thank you for making that change, @sjsprecious .
I wondered about that, too. But I noticed that there was already an optional argument before rc (stream_name), so I figured this was no worse, and was keeping things consistent. Do you have a reference for this Fortran convention stating that optional arguments should all come after mandatory arguments? I do see that as more intuitive, but from what I can tell, it isn't required... but, from a quick search, I haven't been able to find something definitive one way or the other. Of course, if a required argument comes after optional arguments, you will generally need to use name=value in the caller, but I find that to be a good practice anyway. What testing have you done on the latest version? If you haven't already, can you please at least test the build for an A compset and a B compset? |
|
Thanks @billsacks . Here is ECMWF IFS Fortran coding standard where it states "Optional arguments to a routine shall be after non-optional ones" (https://sites.ecmwf.int/docs/ifs-arpege-coding-standards/fortran/rules/L6.html). However, my understanding is that this is a convention, not a hard requirement. Thus as long as the caller is using the named argument, the current changes should be fine. But if we rely on the caller to use named argument, then your suggested change to put I have run the FHISTC_LTso compset successfully with these changes. Can you share the detailed instructions about building an A compset and a B compset on Derecho? I am happy to test them afterwards. |
|
Thanks @sjsprecious for these changes. I would recommend passing mapalgo as an optional parameter instead of a duplicate mesh_in parameter. If the user explicitly sets mapalgo = 'redist', that algorithm choice should dictate the subroutine's behavior. Passing an optional mesh_in parameter or relying on the lower-level routines to infer what to do based on whether a mesh file string is populated is confusing. If the model is performing a 1-to-1 index redistribution, the routine doesn't not need a mesh parameter at all since the model is just doing an array copy. I'm also concerned about error checking. I assume the redistribution will fail if the model mesh is a different size from the offline file for the redist option. I'm not sure if the corrective action will be apparent from the existing error. Maybe just verifying the dimensions of the model and mesh file and erroring out with a message letting the user know that when mapalgo='redist' is set the grid dimensions of the model and mesh file must match. |
|
Thanks @jtruesdal for commenting on the interface change and error checking. I have made the code changes you suggest at af456c0 and let me know if it looks better to you now. |
|
@mvertens - I'd be very happy to have your eyes on this as well, if you have time... if you don't have time, I at least want you to be aware of it. |
billsacks
left a comment
There was a problem hiding this comment.
@jtruesdal 's points were very good. @sjsprecious thank you for addressing them. The code looks good as far as I can tell, but I'm also requesting a review from @mvertens .
I do have a few requests for testing / documentation before this is merged:
(1) The new change seems better than the original version, but now has the potential to impact more configurations, so I'd like to see additional testing before this is merged. I'm thinking at least aux_cdeps with baseline comparisons and aux_cime_baselines with baseline comparisons (either proactively or keeping a careful eye on it the nightly testing after this is merged). @sjsprecious - if you are unclear on how to do this, can you please work with @fischer-ncar ? @fischer-ncar - also please feel free to give more guidance on what testing would be appropriate here.
(2) I also want to check: how confident are you that the assignment stream_mesh = sdat%model_mesh is safe and correct when the decomposition of the stream differs from the decomposition of the model - i.e., when the redist actually has something to do? (I'm not clear on whether the mesh returned from ESMF_MeshCreate is tied to the decomposition.) If you haven't already done so, can you verify that this is safe through some combination of research and testing (with a test case that covers this situation where the decompositions differ)?
(3) Now that the implementation differs from your original, please change the top-level PR comment accordingly.
| if (trim(sdat%stream(ns)%mapalgo) == 'redist' .and. filename /= 'none') then | ||
| ! A 'redist' stream is on the same grid as the model: the stream->model mapping is a 1-to-1 index copy, | ||
| ! so the stream mesh and the model mesh are identical. Reuse the already-built model mesh instead of reading | ||
| ! the stream mesh file and constructing a duplicate full ESMF mesh -- that duplicate is a large init memory | ||
| ! and time cost at high resolution. | ||
| ! Only stream_mesh is read below (no MeshSet/MeshDestroy), so sharing the model-mesh handle is safe. | ||
| ! First verify the grids really match (redist requires identical global sizes) and error clearly if not. | ||
| call shr_strdata_check_redist_size(sdat, ns, trim(filename), rc=rc) | ||
| if (ChkErr(rc,__LINE__,u_FILE_u)) return | ||
| stream_mesh = sdat%model_mesh |
There was a problem hiding this comment.
@mvertens - I especially would welcome your input on whether the assignment of stream_mesh = sdat%model_mesh seems general and safe here.
There was a problem hiding this comment.
This looks okay to me. There is already a copy being done in the inline call - so this is should be fine as well.
There was a problem hiding this comment.
@sjsprecious - I talked with @mvertens about this. She has convinced me that it's fine. However, the reason this is safe is potentially subtle, and I think warrants an additional comment - which could add to the comment above (or possibly replace some of the text in the existing comment). Can you add something like this, if this sounds good / correct to you?
We call this a redist, but in fact the model decomposition is the same as the stream decomposition, so the redist ends up simply being a copy. This makes it safe to set the stream mesh equal to the model mesh - since both the underlying grids/meshes and their decompositions are the same.
I'm reaching out to the ESMF group to get their take on this but I also welcome your thoughts and would be happy to see a test that confirms that this works - though I'm not sure myself what configuration would exercise this. |
|
Thanks @billsacks .
Happy to work with @fischer-ncar on these tests. While I am working on other PRs for similar memory fixes, I noticed that I did not have permission to access the baseline on Derecho. Can you and someone grant me the access before @fischer-ncar shares more details about the test setups?
I have no ideas about this part. If you have any suggested tests or hear back from ESMF team, I am happy to do them on Derecho for confirmation.
Done. |
Are you having trouble with read permission or write permission? Spot-checking some baselines, it looks like they should have world read permission. If there are specific directories that you're having trouble accessing, @fischer-ncar can hopefully help with that. If the issue is writing: in general, writing one-off baselines should go somewhere else. (When I need to do both reads and writes of baselines, I create my own baseline directory in scratch space, sym link from there to the pre-existing baselines I want to compare against, and then point to that directory in my space for baseline generation & comparison.)
Okay. This feels like an important thing to confirm, either from someone who has a deep enough understanding of the workings of these mesh structures or through figuring out a test that can be done. I want to hold off on bringing this in until we get confirmation through at least one of those means. |
For my CTSM PR, I did not have read access to the directory
Thanks @billsacks. I understand your concern and hopefully we can find the answer soon. |
|
Non-CTSM baselines are here: /glade/campaign/cesm/cesmdata/cseg/cesm_baselines I'm not sure about aux_cime_baselines |
| if (trim(sdat%stream(ns)%mapalgo) == 'redist' .and. filename /= 'none') then | ||
| ! A 'redist' stream is on the same grid as the model: the stream->model mapping is a 1-to-1 index copy, | ||
| ! so the stream mesh and the model mesh are identical. Reuse the already-built model mesh instead of reading | ||
| ! the stream mesh file and constructing a duplicate full ESMF mesh -- that duplicate is a large init memory | ||
| ! and time cost at high resolution. | ||
| ! Only stream_mesh is read below (no MeshSet/MeshDestroy), so sharing the model-mesh handle is safe. | ||
| ! First verify the grids really match (redist requires identical global sizes) and error clearly if not. | ||
| call shr_strdata_check_redist_size(sdat, ns, trim(filename), rc=rc) | ||
| if (ChkErr(rc,__LINE__,u_FILE_u)) return | ||
| stream_mesh = sdat%model_mesh |
There was a problem hiding this comment.
This looks okay to me. There is already a copy being done in the inline call - so this is should be fine as well.
|
Thanks the changes look good to me too. |
|
The aux_cime_baselines are in /glade/campaign/cesm/cesmdata/cesm_baselines/nightly_current. But they are updated nightly with updated externals. So it'll be hard to reproduce. |
billsacks
left a comment
There was a problem hiding this comment.
Just requesting an additional comment. Besides that, this looks good once the requested testing has been done.
| if (trim(sdat%stream(ns)%mapalgo) == 'redist' .and. filename /= 'none') then | ||
| ! A 'redist' stream is on the same grid as the model: the stream->model mapping is a 1-to-1 index copy, | ||
| ! so the stream mesh and the model mesh are identical. Reuse the already-built model mesh instead of reading | ||
| ! the stream mesh file and constructing a duplicate full ESMF mesh -- that duplicate is a large init memory | ||
| ! and time cost at high resolution. | ||
| ! Only stream_mesh is read below (no MeshSet/MeshDestroy), so sharing the model-mesh handle is safe. | ||
| ! First verify the grids really match (redist requires identical global sizes) and error clearly if not. | ||
| call shr_strdata_check_redist_size(sdat, ns, trim(filename), rc=rc) | ||
| if (ChkErr(rc,__LINE__,u_FILE_u)) return | ||
| stream_mesh = sdat%model_mesh |
There was a problem hiding this comment.
@sjsprecious - I talked with @mvertens about this. She has convinced me that it's fine. However, the reason this is safe is potentially subtle, and I think warrants an additional comment - which could add to the comment above (or possibly replace some of the text in the existing comment). Can you add something like this, if this sounds good / correct to you?
We call this a redist, but in fact the model decomposition is the same as the stream decomposition, so the redist ends up simply being a copy. This makes it safe to set the stream mesh equal to the model mesh - since both the underlying grids/meshes and their decompositions are the same.
|
@sjsprecious I'll run the aux_cdeps and aux_cime_baselines tests for you. I'll let you know the results once they finished. |
That is awesome! Thanks @fischer-ncar for your time and help. |
Thanks @billsacks . I have updated the comment based on your suggsetion. |
billsacks
left a comment
There was a problem hiding this comment.
I like the way you worked this comment in - thank you!
So I feel this is ready once @fischer-ncar reports back on the testing.
|
The aux_cdeps and aux_cime_baselines expected passing tests all passed. So looks good. |
Thanks @fischer-ncar for doing the tests so quickly and confirming the results. |
This PR introduces some changes in CDEPS that will be used in CTSM later and are critical to reduce memory usage of a simulation at ne1024 resolution. All the changes are done by Claude under my supervisory.
The goal is to avoid the construction of a duplicate ESMF mesh for streams whose data already live on the model grid. When a stream uses
mapalgo = 'redist'the stream→model mapping is a 1-to-1 index copy, so the stream mesh and the model mesh are necessarily identical. Previouslyshr_strdata_initalways read the stream mesh file and calledESMF_MeshCreateto build a second, full ESMF mesh — atne1024resolution that duplicate (NetCDF mesh-file read + ESMF mesh build) is a large initialization memory and time cost.All edits are in
streams/dshr_strdata_mod.F90:rediststreams. At the mesh-creation site inshr_strdata_init, whenmapalgo == 'redist'(and the stream has a mesh file) the stream mesh is set to the already-built model mesh (sdat%model_mesh) instead of building a new one from file:This is a handle copy (the same assignment already used for the file-built mesh), so no second ESMF mesh is constructed. The per-stream
stream_meshis only ever read downstream (there is noESMF_MeshSet/ESMF_MeshDestroyanywhere in the streams code), so sharing the model-mesh handle is safe — it is never mutated or freed through the stream slot.shr_strdata_check_redist_size.redistrequires the stream and model grids to have the same global size. The new routine opens the stream mesh file and reads only itselementCountdimension — cheap metadata, not the full mesh, so the memory/time saving is preserved — and compares it to the model mesh global size (sdat%model_gsize). On a mismatch it aborts with a clear message stating thatmapalgo='redist'requires matching grids and to use a non-redist mapalgo (consf/consd/nn/bilinear) for a different stream grid. If the mesh file is not in ESMF format (noelementCountdimension) the check is skipped and any problem is left to the downstream read. The PIO calls mirror the existingshr_strdata_get_stream_nlevidiom and are collective over the PIO subsystem; every rank holds the samemodel_gsizeand reads the sameelementCount, so the comparison is identical on all ranks.