Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 9 additions & 9 deletions object-file.c
Original file line number Diff line number Diff line change
Expand Up @@ -561,9 +561,9 @@ int odb_source_loose_read_object_info(struct odb_source *source,
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Thu, Jun 04, 2026 at 05:15:09PM +0000, Philip Oakley via GitGitGadget wrote:
> diff --git a/object-file.c b/object-file.c
> index 1f5f9daf24..c648cecd80 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -581,7 +581,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
>  	/* Generate the header */
>  	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
>  
> -	/* Sha1.. */
> +	/* Hash (function pointers) computation */
>  	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
>  }
>  

Thanks for updating this comment while at it :)

> diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> index 7867fd1dbf..10382a815e 100755
> --- a/t/t1007-hash-object.sh
> +++ b/t/t1007-hash-object.sh
> @@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
>  	test_cmp expect actual
>  '
>  
> -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  		'files over 4GB hash literally' '
>  	test-tool genzeros $((5*1024*1024*1024)) >big &&
>  	test_oid large5GB >expect &&

Previously we required `!LONG_IS_64BIT`, because the test would have
succeeded on platforms where it is 64 bit wide. But now that this test
works on all platforms I rather wonder whether we should completely drop
that prerequisite here, as we expect it to pass regardless of whether or
not long is 64 bit now.

Patrick


static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Thu, Jun 04, 2026 at 05:15:08PM +0000, Philip Oakley via GitGitGadget wrote:
> From: Philip Oakley <philipoakley@iee.email>
> 
> Continue walking the code path for the >4GB `hash-object --literally`
> test. The `hash_object_file_literally()` function internally uses both
> `hash_object_file()` and `write_object_file_prepare()`. Both function
> signatures use `unsigned long` rather than `size_t` for the mem buffer
> sizes. Use `size_t` instead, for LLP64 compatibility.
> 
> While at it, convert those function's object's header buffer length to
> `size_t` for consistency. The value is already upcast to `uintmax_t` for
> print format compatibility.

One thing I was wondering is whether we should rather migrate to a size
that is consistent across different platforms. We could e.g. `typedef
uint64_t objsize_t` and then use that going forward.

I guess the question though is whether that'd buy us anything. In other
words, are there any platforms that we care about where `size_t` is only
32 bit wide? And would such platforms even be able to handle such large
objects?

Patrick

const void *buf, unsigned long len,
const void *buf, size_t len,
struct object_id *oid,
char *hdr, int *hdrlen)
char *hdr, size_t *hdrlen)
{
algo->init_fn(c);
git_hash_update(c, hdr, *hdrlen);
Expand All @@ -572,16 +572,16 @@ static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_c
}

static void write_object_file_prepare(const struct git_hash_algo *algo,
const void *buf, unsigned long len,
const void *buf, size_t len,
enum object_type type, struct object_id *oid,
char *hdr, int *hdrlen)
char *hdr, size_t *hdrlen)
{
struct git_hash_ctx c;

/* Generate the header */
*hdrlen = format_object_header(hdr, *hdrlen, type, len);

/* Sha1.. */
/* Hash (function pointers) computation */
hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
}

Expand Down Expand Up @@ -717,11 +717,11 @@ int finalize_object_file_flags(struct repository *repo,
}

void hash_object_file(const struct git_hash_algo *algo, const void *buf,
unsigned long len, enum object_type type,
size_t len, enum object_type type,
struct object_id *oid)
{
char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
size_t hdrlen = sizeof(hdr);

write_object_file_prepare(algo, buf, len, type, oid, hdr, &hdrlen);
}
Expand Down Expand Up @@ -1177,7 +1177,7 @@ int odb_source_loose_write_stream(struct odb_source *source,
}

int odb_source_loose_write_object(struct odb_source *source,
const void *buf, unsigned long len,
const void *buf, size_t len,
enum object_type type, struct object_id *oid,
struct object_id *compat_oid_in,
enum odb_write_object_flags flags)
Expand All @@ -1186,7 +1186,7 @@ int odb_source_loose_write_object(struct odb_source *source,
const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
struct object_id compat_oid;
char hdr[MAX_HEADER_LEN];
int hdrlen = sizeof(hdr);
size_t hdrlen = sizeof(hdr);

/* Generate compat_oid */
if (compat) {
Expand Down
4 changes: 2 additions & 2 deletions object-file.h
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ int odb_source_loose_freshen_object(struct odb_source *source,
const struct object_id *oid);

int odb_source_loose_write_object(struct odb_source *source,
const void *buf, unsigned long len,
const void *buf, size_t len,
enum object_type type, struct object_id *oid,
struct object_id *compat_oid_in,
enum odb_write_object_flags flags);
Expand Down Expand Up @@ -201,7 +201,7 @@ int finalize_object_file_flags(struct repository *repo,
enum finalize_object_file_flags flags);

void hash_object_file(const struct git_hash_algo *algo, const void *buf,
unsigned long len, enum object_type type,
size_t len, enum object_type type,
struct object_id *oid);

/* Helper to check and "touch" a file */
Expand Down
3 changes: 1 addition & 2 deletions sha1dc_git.c
Original file line number Diff line number Diff line change
Expand Up @@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
/*
* Same as SHA1DCUpdate, but adjust types to match git's usual interface.
*/
void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
{
const char *data = vdata;
/* We expect an unsigned long, but sha1dc only takes an int */
while (len > INT_MAX) {
SHA1DCUpdate(ctx, data, INT_MAX);
data += INT_MAX;
Expand Down
2 changes: 1 addition & 1 deletion sha1dc_git.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
#endif

void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);

#define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */

Expand Down
39 changes: 39 additions & 0 deletions t/t1007-hash-object.sh
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,9 @@ test_expect_success 'setup' '

example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399
example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae

large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204
large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64
EOF
'

Expand Down Expand Up @@ -258,4 +261,40 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
test_cmp expect actual
'

test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'files over 4GB hash literally' '
test-tool genzeros $((5*1024*1024*1024)) >big &&
test_oid large5GB >expect &&
git hash-object --stdin --literally <big >actual &&
test_cmp expect actual

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Thu, Jun 04, 2026 at 05:15:10PM +0000, Philip Oakley via GitGitGadget wrote:
> diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> index 10382a815e..59efee3aff 100755
> --- a/t/t1007-hash-object.sh
> +++ b/t/t1007-hash-object.sh
> @@ -269,4 +269,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  	test_cmp expect actual
>  '
>  
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB hash correctly via --stdin' '
> +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> +	test_oid large5GB >expect &&
> +	git hash-object --stdin <big >actual &&
> +	test_cmp expect actual
> +'

Same comment here: can we drop the `!LONG_IS_64BIT` prereq?

Patrick

'

test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'files over 4GB hash correctly via --stdin' '
{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
test_oid large5GB >expect &&
git hash-object --stdin <big >actual &&
test_cmp expect actual

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Thu, Jun 04, 2026 at 05:15:11PM +0000, Philip Oakley via GitGitGadget wrote:
> diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
> index 59efee3aff..f2722380ee 100755
> --- a/t/t1007-hash-object.sh
> +++ b/t/t1007-hash-object.sh
> @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  	test_cmp expect actual
>  '
>  
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB hash correctly' '
> +	{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
> +	test_oid large5GB >expect &&
> +	git hash-object -- big >actual &&
> +	test_cmp expect actual
> +'

Same comment here.

Nit: I feel like we could've easily introduced all of these tests in the
first commit.

Patrick

'

test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'files over 4GB hash correctly' '
{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
test_oid large5GB >expect &&
git hash-object -- big >actual &&
test_cmp expect actual
'

# This clean filter does nothing, other than excercising the interface.
# We ensure that cleaning doesn't mangle large files on 64-bit Windows.
test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'hash filtered files over 4GB correctly' '
{ test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } &&
test_oid large5GB >expect &&
test_config filter.null-filter.clean "cat" &&
echo "big filter=null-filter" >.gitattributes &&
git hash-object -- big >actual &&
test_cmp expect actual
'

test_done
Loading