commit-graph: use timestamp_t for max parent generation accumulator #2148
Open
newren wants to merge 1 commit into
compute_reachable_generation_numbers() computes each commit's
generation as

    max(c->date, max(parent.generation)) + 1

by walking its parents and accumulating their generations into a
local

    uint32_t max_gen = 0;

while info->get_generation() returns timestamp_t and
compute_generation_from_max() already takes its max_gen parameter
as timestamp_t. For v1 (topological levels) the narrowing is
harmless because GENERATION_NUMBER_V1_MAX is less than 2^30, but
for v2 (corrected committer dates) it silently truncates any
parent generation that does not fit in 32 bits, i.e. any parent
whose committer timestamp is at or beyond 2106-02-07 UTC
(>= 2^32).
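
The narrowing described above can be reproduced in isolation. The following is a minimal sketch, not Git's code: `gen_t` stands in for `timestamp_t` (assumed here to be a 64-bit unsigned type), and `max_parent_gen_narrow()` and `max_parent_gen_wide()` are hypothetical helpers that differ only in the type of the accumulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Git's timestamp_t; assumed to be 64 bits wide here. */
typedef uint64_t gen_t;

/* Buggy shape: the running maximum lives in a 32-bit accumulator. */
gen_t max_parent_gen_narrow(const gen_t *parent_gens, size_t n)
{
	uint32_t max_gen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (parent_gens[i] > max_gen)
			max_gen = (uint32_t)parent_gens[i]; /* high bits are dropped */
	}
	return max_gen;
}

/* Fixed shape: the accumulator is as wide as the values it collects. */
gen_t max_parent_gen_wide(const gen_t *parent_gens, size_t n)
{
	gen_t max_gen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (parent_gens[i] > max_gen)
			max_gen = parent_gens[i];
	}
	return max_gen;
}
```

For a single parent whose generation is 4294967301 (2^32 + 5), the narrow variant yields 5, while the wide variant returns the value unchanged.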

The truncated max then causes child commits to end up with a
corrected committer date that matches the parent's instead of being
at least 1 higher. The bad value gets written into the commit-graph
and causes problems later, and can be noticed by running
`git commit-graph verify`.
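
The effect on a child commit can be sketched as follows, using the formula as it is written in this message. This is an illustration under stated assumptions, not the checks that `git commit-graph verify` actually performs: `gen_t`, `truncate_to_u32()`, `child_generation()` and `generation_is_consistent()` are hypothetical names, and the child date used in the note below is an arbitrary present-day value.

```c
#include <stdint.h>

/* Stand-in for Git's timestamp_t; assumed to be 64 bits wide here. */
typedef uint64_t gen_t;

/* Simulates storing a value in the 32-bit accumulator. */
gen_t truncate_to_u32(gen_t value)
{
	return (uint32_t)value;
}

/* max(date, max_parent_gen) + 1, following the formula quoted in the text. */
gen_t child_generation(gen_t date, gen_t max_parent_gen)
{
	return (date > max_parent_gen ? date : max_parent_gen) + 1;
}

/* The invariant at stake: a child must be at least parent + 1. */
int generation_is_consistent(gen_t child_gen, gen_t parent_gen)
{
	return child_gen >= parent_gen + 1;
}
```

With a parent generation of 4294967301 and a hypothetical child date of 1112912053, feeding the truncated maximum (5) into `child_generation()` gives 1112912054, which fails the check; the untruncated maximum gives 4294967302, which passes.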

Widen the accumulator to timestamp_t.

This is solely an in-memory arithmetic fix with no on-disk format
change: the on-disk format already encodes timestamp_t values and
existing readers handle them unchanged. This merely allows the code to
compute the correct value to write to disk.

The narrowing was introduced in 80c928d (commit-graph:
simplify compute_generation_numbers(), 2023-03-20), which rewired
v2 to use the shared compute_reachable_generation_numbers()
helper; the helper's local accumulator had been declared uint32_t
in the immediately preceding 368d19b (commit-graph: refactor
compute_topological_levels(), 2023-03-20) when only v1 was using
it, where it was harmless.

Add a new test with a future-dated parent and a present-day child;
without the above fix, `git commit-graph verify` reports the
descendant's stored generation as below parent + 1.

Signed-off-by: Elijah Newren <newren@gmail.com>

Author: /submit

Submitted as pull.2148.git.1781420271100.gitgitgadget@gmail.com

Patrick Steinhardt wrote on the Git mailing list:

On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
> commit-graph: use timestamp_t for max parent generation accumulator
>
> We found a few repositories in the wild with commits whose authors were
> apparently on a computer in the year 2120 when they recorded their
> commits. Apparently, in a century from now, some folks are going to have
> a really weird timezone as well (-13068837), though the timezone doesn't
> factor into this patch at all.
I'd really be curious which other parts of Git will start to break once
we cross that threshold. Would it make sense if we maybe expanded our
linux-TEST-VARS job to create commits with a date beyond UINT32_MAX?
Something like the patch at the end of this mail. And yes, many tests
break with the patch applied. From all I've seen though many of those
failures are benign, even though I'd bet that there might even be some
"proper" failures in there.
Anyway, this is of course outside the scope of this patch series.
> diff --git a/commit-graph.c b/commit-graph.c
> index 9abe62bd5a..4b7156fd76 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
>  			struct commit *current = list->item;
>  			struct commit_list *parent;
>  			int all_parents_computed = 1;
> -			uint32_t max_gen = 0;
> +			timestamp_t max_gen = 0;
>  
>  			for (parent = current->parents; parent; parent = parent->next) {
>  				repo_parse_commit(info->r, parent->item);
This looks obviously correct.
> diff --git a/t/t5328-commit-graph-64bit-time.sh b/t/t5328-commit-graph-64bit-time.sh
> index d8891e6a92..bc651b69de 100755
> --- a/t/t5328-commit-graph-64bit-time.sh
> +++ b/t/t5328-commit-graph-64bit-time.sh
> @@ -74,6 +74,15 @@ test_expect_success 'single commit with generation data exceeding UINT32_MAX' '
>  	git -C repo-uint32-max commit-graph verify
>  '
>  
> +test_expect_success 'descendant of commit with date exceeding UINT32_MAX' '
> +	git init repo-uint32-max-descendant &&
> +	test_commit -C repo-uint32-max-descendant \
> +		--date "@4294967300 +0000" future-parent &&
> +	test_commit -C repo-uint32-max-descendant present-day-child &&
> +	git -C repo-uint32-max-descendant commit-graph write --reachable &&
> +	git -C repo-uint32-max-descendant commit-graph verify
> +'
Makes sense. Thanks!
Patrick
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 809c662124..e78902b671 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -136,12 +136,19 @@ sane_unset () {
 test_tick () {
 	if test -z "${test_tick+set}"
 	then
-		test_tick=1112911993
+		if test_bool_env GIT_TEST_FUTURE false
+		then
+			test_tick=4294697600
+			test_tick_prefix=@
+		else
+			test_tick=1112911993
+			test_tick_prefix=
+		fi
 	else
 		test_tick=$(($test_tick + 60))
 	fi
-	GIT_COMMITTER_DATE="$test_tick -0700"
-	GIT_AUTHOR_DATE="$test_tick -0700"
+	GIT_COMMITTER_DATE="$test_tick_prefix$test_tick -0700"
+	GIT_AUTHOR_DATE="$test_tick_prefix$test_tick -0700"
 	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
 }
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4a7357b547..54798fb3f1 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -558,12 +558,26 @@ TEST_AUTHOR_LOCALNAME=author
 TEST_AUTHOR_DOMAIN=example.com
 GIT_AUTHOR_EMAIL=${TEST_AUTHOR_LOCALNAME}@${TEST_AUTHOR_DOMAIN}
 GIT_AUTHOR_NAME='A U Thor'
-GIT_AUTHOR_DATE='1112354055 +0200'
 TEST_COMMITTER_LOCALNAME=committer
 TEST_COMMITTER_DOMAIN=example.com
 GIT_COMMITTER_EMAIL=${TEST_COMMITTER_LOCALNAME}@${TEST_COMMITTER_DOMAIN}
 GIT_COMMITTER_NAME='C O Mitter'
-GIT_COMMITTER_DATE='1112354055 +0200'
+
+case "${GIT_TEST_FUTURE:-false}" in
+1|on|true|yes)
+	GIT_AUTHOR_DATE="${GIT_TEST_DATE:-@4294697300 +0200}"
+	GIT_COMMITTER_DATE="${GIT_TEST_DATE:-@4294697300 +0200}"
+	;;
+0|off|false|no)
+	GIT_AUTHOR_DATE="${GIT_TEST_DATE:-1112354055 +0200}"
+	GIT_COMMITTER_DATE="${GIT_TEST_DATE:-1112354055 +0200}"
+	;;
+*)
+	echo "GIT_TEST_FUTURE requires a boolean" >&2
+	exit 1
+	;;
+esac
+
 GIT_MERGE_VERBOSITY=5
 GIT_MERGE_AUTOEDIT=no
 export GIT_MERGE_VERBOSITY GIT_MERGE_AUTOEDIT
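
Since the stated goal is commit dates beyond UINT32_MAX, it may help to check the epoch values that appear in this thread against that boundary. The following is a small sketch only; `exceeds_uint32()` and `ticks_until_beyond_uint32()` are hypothetical helpers and not part of Git or its test suite.

```c
#include <stdint.h>

/* Hypothetical helper: does this epoch value need more than 32 bits? */
int exceeds_uint32(uint64_t epoch_seconds)
{
	return epoch_seconds > UINT32_MAX;
}

/*
 * Hypothetical helper: how many increments of `step` seconds are needed
 * before a counter starting at `start` exceeds UINT32_MAX.
 * `step` must be non-zero.
 */
uint64_t ticks_until_beyond_uint32(uint64_t start, uint64_t step)
{
	if (start > UINT32_MAX)
		return 0;
	return (UINT32_MAX - start) / step + 1;
}
```

With the values quoted here, `4294967300` from the new t5328 test exceeds UINT32_MAX (4294967295), whereas `4294697600` and `4294697300` as written in the test-lib sketch are still slightly below it. Starting from `4294697600`, a counter advancing by 60 seconds would need 4495 increments to cross the boundary.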

Derrick Stolee wrote on the Git mailing list:

On 6/15/26 4:11 AM, Patrick Steinhardt wrote:
> On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
>> commit-graph: use timestamp_t for max parent generation accumulator
>>
>> We found a few repositories in the wild with commits whose authors were
>> apparently on a computer in the year 2120 when they recorded their
>> commits. Apparently, in a century from now, some folks are going to have
>> a really weird timezone as well (-13068837), though the timezone doesn't
>> factor into this patch at all.
>> @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
>>  			struct commit *current = list->item;
>>  			struct commit_list *parent;
>>  			int all_parents_computed = 1;
>> -			uint32_t max_gen = 0;
>> +			timestamp_t max_gen = 0;
>>
>>  			for (parent = current->parents; parent; parent = parent->next) {
>>  				repo_parse_commit(info->r, parent->item);
>
> This looks obviously correct.
I agree. I was surprised this was the only necessary change, but
your message clearly describes how the timing of the patch that
delivered this change contributed to the mismatch.
Thanks,
-Stolee

This patch series was integrated into seen via git@cc33e45.
We found a few repositories in the wild with commits whose authors were apparently on a computer in the year 2120 when they recorded their commits. Apparently, in a century from now, some folks are going to have a really weird timezone as well (-13068837), though the timezone doesn't factor into this patch at all.
cc: Patrick Steinhardt <ps@pks.im>
cc: Derrick Stolee <stolee@gmail.com>