Fix: gate FortiOS deploy-config CLI fallback on rejected commands#3466
Merged
Conversation
The deploy-config task applies configuration through the config-script API with an `ansible.builtin.expect` SSH fallback. `expect` reports success whenever the SSH session closes cleanly, which it does even when FortiOS rejected every command, so a malformed configuration was silently accepted: the play went green while the device was left partially configured. Gate the fallback by scanning its transcript for FortiOS error markers and failing the task when any are present. The check is deliberately on the CLI transcript rather than on the API result. The fallback exists because the config-script API returns HTTP 500 on some FortiOS releases (e.g. 7.0.17) even for valid configuration, so a 500 is not a reliable signal of a bad configuration; treating it as fatal would regress those releases to a hard failure. Scanning the transcript keeps that workaround intact (valid configuration applied via CLI produces no error markers) while still catching genuine rejections. The markers are anchored to the start of a line, because the transcript echoes the input configuration on prompt-prefixed lines; anchoring prevents an echoed value that merely contains the text from tripping the check. The optional numeric prefix tolerates FortiOS's `NNNN:` error form so a prefixed error at line start still matches. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I assume, deploy-config should fail on a malformed config deployment
rather then silently apply it partially.
Problem
netsim/ansible/tasks/deploy-config/fortios.ymlapplies device configurationthrough the config-script API (
fortios_monitor: upload.system.config-script)with an
ansible.builtin.expectSSH fallback in arescueblock.expectreports success whenever the SSH session closes cleanly — which itdoes even when FortiOS rejected every command. So a malformed
configuration was silently accepted: the API call failed, the rescue pasted
the configuration over SSH, FortiOS rejected the bad lines, the session closed
with
rc 0, and the play reported success. The device was left partiallyconfigured (the valid lines applied) while
netlabreported[CREATED]andexited 0.
This was latent until
pexpectwas installed on the control host; without itthe rescue merely crashed on the missing import, which accidentally surfaced
the failure.
Fix
Gate the fallback on its own transcript:
registertheexpectresult andfailed_whenit if the transcript contains a FortiOS error marker (or the SSHcommand itself failed).
Why gate the CLI transcript rather than the API result
The fallback exists (PR #2864) because the config-script API returns HTTP
500 even for valid configuration on some FortiOS releases (e.g. the 7.0.17
build used to build the boxes). A 500 is therefore not a reliable signal of
a bad configuration; treating it as fatal would regress those releases to a
hard failure. Scanning the CLI transcript instead keeps the workaround intact
— valid configuration applied via CLI produces no error markers — while still
catching genuine rejections regardless of FortiOS version.
Marker design
Command fail. Return code,Unknown action,command parse error); a lowercase pattern would need(?i), which only widens the false-positive surface.^,multiline=true). The transcript echoes theinput configuration on prompt-prefixed lines (
dut (ospf) # set …), soanchoring stops an echoed value that merely contains the marker text from
tripping the check.
\d+:prefix. FortiOS also emits errors as8610: Unknown action; the optional numeric prefix lets such a line match at line start.(A naive
^anchor would have missed it — a false negative, the dangerousdirection.)
Command fail. Return code -Naccompanies mostrejections, but a bare
Unknown action 0can appear without it (asetwhose parent
configalready failed).Test evidence (live, FortiOS v8.0.0,
netlab config)[CREATED](silent success)failed_when_result: true,FatalErrorRegex also validated offline against the captured transcript: it matches all
four observed error forms (
command parse error,Command fail. Return code,bare
Unknown action, prefixed8610: Unknown action) and rejects benignechoed descriptions containing the exact mixed-case marker text.
yamllintclean.