Skip to content

Fix: gate FortiOS deploy-config CLI fallback on rejected commands#3466

Merged
ipspace merged 1 commit into
ipspace:devfrom
a-v-popov:fos-deploy-config-gate
Jun 9, 2026
Merged

Fix: gate FortiOS deploy-config CLI fallback on rejected commands#3466
ipspace merged 1 commit into
ipspace:devfrom
a-v-popov:fos-deploy-config-gate

Conversation

@a-v-popov

Copy link
Copy Markdown
Collaborator

I assume, deploy-config should fail on a malformed config deployment
rather then silently apply it partially.

Problem

netsim/ansible/tasks/deploy-config/fortios.yml applies device configuration
through the config-script API (fortios_monitor: upload.system.config-script)
with an ansible.builtin.expect SSH fallback in a rescue block.

expect reports success whenever the SSH session closes cleanly — which it
does even when FortiOS rejected every command. So a malformed
configuration was silently accepted: the API call failed, the rescue pasted
the configuration over SSH, FortiOS rejected the bad lines, the session closed
with rc 0, and the play reported success. The device was left partially
configured (the valid lines applied) while netlab reported [CREATED] and
exited 0.

This was latent until pexpect was installed on the control host; without it
the rescue merely crashed on the missing import, which accidentally surfaced
the failure.

Fix

Gate the fallback on its own transcript: register the expect result and
failed_when it if the transcript contains a FortiOS error marker (or the SSH
command itself failed).

    register: fortios_cli_result
    vars:
      fos_config_error: '^(\d+: )?(Command fail\. Return code|Unknown action|command parse error)'
    failed_when: >-
      fortios_cli_result.rc != 0 or
      fortios_cli_result.stdout is search(fos_config_error, multiline=true)

Why gate the CLI transcript rather than the API result

The fallback exists (PR #2864) because the config-script API returns HTTP
500 even for valid configuration on some FortiOS releases
(e.g. the 7.0.17
build used to build the boxes). A 500 is therefore not a reliable signal of
a bad configuration; treating it as fatal would regress those releases to a
hard failure. Scanning the CLI transcript instead keeps the workaround intact
— valid configuration applied via CLI produces no error markers — while still
catching genuine rejections regardless of FortiOS version.

Marker design

  • Case-sensitive. The strings are mixed-case (Command fail. Return code,
    Unknown action, command parse error); a lowercase pattern would need
    (?i), which only widens the false-positive surface.
  • Anchored to line start (^, multiline=true). The transcript echoes the
    input configuration on prompt-prefixed lines (dut (ospf) # set …), so
    anchoring stops an echoed value that merely contains the marker text from
    tripping the check.
  • Optional \d+: prefix. FortiOS also emits errors as 8610: Unknown action; the optional numeric prefix lets such a line match at line start.
    (A naive ^ anchor would have missed it — a false negative, the dangerous
    direction.)
  • All three markers retained: Command fail. Return code -N accompanies most
    rejections, but a bare Unknown action 0 can appear without it (a set
    whose parent config already failed).

Test evidence (live, FortiOS v8.0.0, netlab config)

Scenario Path exercised Before After
Malformed config API 500 → CLI rescue exit 0, [CREATED] (silent success) exit 1, failed_when_result: true, FatalError
Valid config API 200 exit 0 exit 0 (unchanged)
Valid config, API forced to fail CLI rescue n/a exit 0 (workaround preserved, no false positive)

Regex also validated offline against the captured transcript: it matches all
four observed error forms (command parse error, Command fail. Return code,
bare Unknown action, prefixed 8610: Unknown action) and rejects benign
echoed descriptions containing the exact mixed-case marker text.

yamllint clean.

The deploy-config task applies configuration through the config-script API
with an `ansible.builtin.expect` SSH fallback. `expect` reports success
whenever the SSH session closes cleanly, which it does even when FortiOS
rejected every command, so a malformed configuration was silently accepted:
the play went green while the device was left partially configured.

Gate the fallback by scanning its transcript for FortiOS error markers and
failing the task when any are present.

The check is deliberately on the CLI transcript rather than on the API
result. The fallback exists because the config-script API returns HTTP 500
on some FortiOS releases (e.g. 7.0.17) even for valid configuration, so a
500 is not a reliable signal of a bad configuration; treating it as fatal
would regress those releases to a hard failure. Scanning the transcript
keeps that workaround intact (valid configuration applied via CLI produces
no error markers) while still catching genuine rejections.

The markers are anchored to the start of a line, because the transcript
echoes the input configuration on prompt-prefixed lines; anchoring prevents
an echoed value that merely contains the text from tripping the check. The
optional numeric prefix tolerates FortiOS's `NNNN:` error form so a prefixed
error at line start still matches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@ipspace ipspace left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a million!

@ipspace ipspace merged commit e88b43a into ipspace:dev Jun 9, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants