verifier/tpm: share a pooled registrar client and cache bindings#187
Conversation
Collaborator
@jialez0, hello, your request has been received; please wait for the result.
Collaborator
@jialez0, hello, no images were detected that need to be built. To re-run detection, comment /start.
2ae96aa to
1762b95
Compare
The TPM and Hygon TPM verifiers performed the keylime registrar binding
check by building a fresh reqwest::Client and issuing two serial HTTPS
round-trips (/version then /v{ver}/agents/{uuid}) on every attestation.
Under load this made attestation throughput hostage to the registrar
(a low-concurrency, DB-backed service) while the AS itself stayed idle on
CPU: a single-worker registrar at ~150ms/call capped end-to-end throughput
at ~3.3 QPS even though the pure verification path sustains 500-900 QPS.
Introduce a shared tpm_registrar module that:
* reuses one process-wide reqwest::Client so TCP/TLS is pooled instead
of re-established per request,
* caches the registrar API version per registrar URL, and
* caches the per-UUID registrar results with a TTL
(KEYLIME_REGISTRAR_CACHE_TTL_SECS, default 300s),
so repeated attestations for the same agent no longer touch the registrar.
Each verifier still compares the returned EK/AK material against the
evidence being validated, so the binding guarantee is unchanged.
No new dependencies (std OnceLock); MSRV unaffected.
Signed-off-by: Jiale Zhang <zhangjiale@linux.alibaba.com>
1762b95 to
5d14420
Compare
Problem
The TPM (tpm) and Hygon TPM (hygontpm) verifiers perform a keylime registrar binding check whenever the evidence carries a keylime_agent_uuid: they confirm that the EK certificate and AK public key in the evidence match what the registrar recorded for that agent.
The current implementation builds a brand-new reqwest::Client and issues two serial HTTPS round-trips (/version, then /v{ver}/agents/{uuid}) to the registrar on every attestation. Under load this makes attestation throughput hostage to the registrar, a low-concurrency, DB-backed service, while the AS itself stays idle on CPU.
Measured on a 4-core box with a single-worker registrar answering in ~150 ms: end-to-end /attestation throughput collapses to ~3.3 QPS with the AS process using only 0–13 % of the 4 cores, even though the pure verification path (no keylime_agent_uuid) sustains 600–900 QPS. This exactly matches a low-QPS / low-CPU report from a Hygon TPM benchmarking run.
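The ~3.3 QPS figure is what the numbers above predict if the single-worker registrar handles one call at a time (an assumption consistent with "single-worker", not something measured separately): each attestation costs two serial calls of ~150 ms, so the registrar alone caps throughput at roughly

```latex
\text{QPS}_{\max} \approx \frac{1}{2 \times 0.150\ \text{s}} = \frac{1}{0.300\ \text{s}} \approx 3.3\ \text{s}^{-1}
```

regardless of how many cores the AS has available.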
Fix
Introduce a shared verifier::tpm_registrar module used by both TPM-family verifiers that:
* reuses one process-wide reqwest::Client, so TCP connections and TLS sessions are pooled instead of re-established on every call;
* caches the registrar API version per registrar URL; and
* caches the per-UUID registrar results with a TTL (KEYLIME_REGISTRAR_CACHE_TTL_SECS, default 300 s).
Repeated attestations for the same agent therefore no longer touch the registrar at all while the cached entry is within its TTL. Each verifier still compares the returned EK/AK material against the evidence being validated, so the binding guarantee is unchanged; the TTL bounds how long a stale registration can be trusted after an agent re-registers.
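For reviewers who want a mental model of the per-UUID cache, here is a minimal std-only sketch of a TTL cache with a per-attestation comparison. It is not the code in this PR: AgentBinding, BindingCache, get_or_fetch and binding_matches are hypothetical names, and the registrar lookup is abstracted as a closure so the sketch needs no HTTP client.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the EK/AK material the registrar returns.
#[derive(Clone, Debug, PartialEq)]
pub struct AgentBinding {
    pub ek_cert: String,
    pub ak_pub: String,
}

/// Hypothetical per-UUID cache; entries older than `ttl` count as misses.
pub struct BindingCache {
    ttl: Duration,
    entries: Mutex<HashMap<String, (Instant, AgentBinding)>>,
}

impl BindingCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached binding for `uuid`, calling `fetch` (the registrar
    /// lookup) only on a miss or once the cached entry has expired.
    pub fn get_or_fetch<E>(
        &self,
        uuid: &str,
        fetch: impl FnOnce() -> Result<AgentBinding, E>,
    ) -> Result<AgentBinding, E> {
        {
            let entries = self.entries.lock().unwrap();
            if let Some((stored_at, binding)) = entries.get(uuid) {
                if stored_at.elapsed() < self.ttl {
                    return Ok(binding.clone());
                }
            }
        } // lock released here so a slow fetch does not block other lookups

        let binding = fetch()?;
        self.entries
            .lock()
            .unwrap()
            .insert(uuid.to_string(), (Instant::now(), binding.clone()));
        Ok(binding)
    }
}

/// The comparison against the evidence still runs on every attestation,
/// whether the binding came from the cache or from the registrar.
pub fn binding_matches(cached: &AgentBinding, evidence_ek: &str, evidence_ak: &str) -> bool {
    cached.ek_cert == evidence_ek && cached.ak_pub == evidence_ak
}
```

In this sketch, concurrent cold misses for the same UUID may each call fetch once; whether the actual module deduplicates such requests is not covered here.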
The change removes the duplicated per-request HTTP logic from both deps/verifier/src/tpm/mod.rs and deps/verifier/src/hygon_tpm/mod.rs (net −83 lines there) and adds no new dependencies (uses std::sync::OnceLock); MSRV is unaffected.
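As an illustration of the std::sync::OnceLock pattern mentioned above, the sketch below lazily builds one process-wide value and reads the TTL from KEYLIME_REGISTRAR_CACHE_TTL_SECS. RegistrarClient, parse_ttl and shared_client are hypothetical names, the struct merely stands in for the shared reqwest::Client plus caches, and falling back to the default on a malformed value is an assumption rather than confirmed behaviour of this PR.

```rust
use std::sync::OnceLock;
use std::time::Duration;

/// Default TTL taken from the PR description (300 s).
const DEFAULT_TTL_SECS: u64 = 300;

/// Parses the raw environment value; unset or malformed input falls back
/// to the default (the fallback policy is an assumption of this sketch).
pub fn parse_ttl(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TTL_SECS);
    Duration::from_secs(secs)
}

/// Hypothetical stand-in for the shared HTTP client and its cache settings.
pub struct RegistrarClient {
    pub cache_ttl: Duration,
}

/// One instance per process, initialised on first use.
static SHARED: OnceLock<RegistrarClient> = OnceLock::new();

/// Every caller gets a reference to the same instance, so anything pooled
/// inside it (connections, TLS sessions) is reused across attestations.
pub fn shared_client() -> &'static RegistrarClient {
    SHARED.get_or_init(|| {
        let raw = std::env::var("KEYLIME_REGISTRAR_CACHE_TTL_SECS").ok();
        RegistrarClient { cache_ttl: parse_ttl(raw.as_deref()) }
    })
}
```

Because the value is initialised once, a change to the environment variable after the first call has no effect in this sketch until the process restarts.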
Results
Same 4-core box, benchmark client at concurrency 10, single-worker registrar at
~150 ms/call. "no-uuid" = pure verification path (registrar not involved).
Before the fix, throughput is independent of request count (every request pays
the ~300 ms registrar cost). After the fix, the registrar is hit once per UUID
(cold start) and every subsequent attestation is served from cache, so
throughput scales with batch size toward the pure-verification ceiling.
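A rough way to read "scales with batch size" is the amortisation model below. It is an illustrative assumption, not a fit to the measurements: it supposes a single UUID, one cold miss costing t_cold (about 0.3 s here), and a cached-path rate Q_ceil equal to the no-uuid throughput, and it ignores client concurrency.

```latex
T(N) \approx \frac{N}{t_{\text{cold}} + N / Q_{\text{ceil}}},
\qquad
\lim_{N \to \infty} T(N) = Q_{\text{ceil}}
```

For small N the fixed cold-start term dominates, which is why short runs report lower throughput than long ones.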
Under sustained cached load the AS process now uses ~320 % of the 4 cores
(CPU-bound) instead of sitting idle at 0–13 %.
The no-uuid path is unchanged within run-to-run noise, confirming no regression
to the verification hot path.