Automated Compliance Checking: NIST 800-53 Controls on Network Devices

Turn NIST Controls Into Device Checks

Our quarterly NIST compliance audit required checking 23 controls across 150 Cisco and FortiGate devices, and it was consuming an entire engineer-week before we even got to evidence cleanup. I was sitting in our maintenance office with three SSH sessions open, a spreadsheet full of control IDs, and a growing list of screenshots that nobody wanted to review twice.

We started by translating the audit language into checks that could actually be tested on network devices. AC-17 became remote access restrictions. IA-5 became password and authentication policy validation. AU-2 and AU-12 became logging configuration and forwarding checks. CM-6 became baseline configuration enforcement. On Cisco IOS XE 17.9.4 and FortiOS 7.4.3, those controls do not live in one clean place, so I had to map each control to commands, expected values, acceptable variants, and evidence fields.
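A minimal sketch of what one such mapping might look like, assuming a plain Python dictionary keyed by control ID. The field names, the platform keys, and the specific commands and expected values are illustrative assumptions, not the production control pack.

```python
# Hypothetical mapping of one NIST control to per-platform checks.
# Field names, platform keys, and command strings are illustrative assumptions.
CONTROL_CHECKS = {
    "AC-17": {
        "name": "Remote Access",
        "platforms": {
            "cisco_xe": {
                "commands": ["show running-config | section line vty"],
                "expected": {"transport_input": ["ssh"]},
                "acceptable_variants": [],
                "evidence_fields": ["transport_input", "access_class"],
            },
            "fortinet": {
                "commands": ["show system interface"],
                "expected": {"telnet_allowed": False},
                "acceptable_variants": [],
                "evidence_fields": ["allowaccess"],
            },
        },
    },
}


def commands_for(control_id, platform):
    """Return the commands to run for a control on a platform, or [] if unmapped."""
    control = CONTROL_CHECKS.get(control_id, {})
    check = control.get("platforms", {}).get(platform, {})
    return list(check.get("commands", []))
```

Returning an empty list for an unmapped control or platform keeps "we never defined this check" distinct from "the check ran and failed".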

The audit got real fast.

My working rule was simple: if my team could not explain a control result from the device output alone, the automation was not ready. That kept us honest when a control looked easy on paper but depended on RADIUS server groups, admin profiles, syslog destinations, NTP authentication, SNMPv3 settings, or local fallback behavior. I prefer controls that fail loudly over controls that pass because the parser was too optimistic.

Normalize Cisco And FortiGate Output With Python

The first automation script did not handle multi-vendor command output differences. I assumed Cisco and FortiGate would format the same data in roughly comparable ways because the control intent was the same. That was wrong. Cisco output was mostly line-oriented with nested sections, while FortiGate output often came back as configuration blocks with different object names, implicit defaults, and VDOM context hiding the setting I actually needed.

I built the second version in Python 3.11 on Ubuntu 22.04 and split collection from interpretation. Netmiko handled SSH transport for Cisco IOS XE 17.9.4. FortiGate checks used a vendor adapter that could handle FortiOS 7.4.3 command structure, paging, and VDOM scoping. The parser returned normalized facts, not pass/fail decisions, which meant the compliance engine could stay boring and predictable.

CONTROL_MAP = {
    "AU-12": {
        "name": "Audit Record Generation",
        "facts": ["syslog_servers", "logging_level", "timestamp_enabled"],
        "required": {
            "syslog_servers_min": 2,
            "logging_level": ["informational", "notice"],
            "timestamp_enabled": True
        }
    }
}

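def evaluate_au_12(device_facts):
    """Evaluate AU-12 from normalized facts; a missing fact counts as a failure."""
    # Read thresholds from CONTROL_MAP so the rule lives in exactly one place.
    required = CONTROL_MAP["AU-12"]["required"]
    passed = (
        len(device_facts.get("syslog_servers", [])) >= required["syslog_servers_min"]
        and device_facts.get("logging_level") in required["logging_level"]
        and device_facts.get("timestamp_enabled") is True
    )
    return {
        "control": "AU-12",
        "status": "pass" if passed else "fail",
        "evidence": device_facts,
    }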
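
The evaluator only ever sees normalized facts. The parsing side that produces them can be sketched as two vendor-specific functions that return the same keys. This is a simplified illustration: the function names and sample text are assumptions, it only reads settings that appear explicitly in the output, and it does not handle implicit defaults or VDOM scoping.

```python
import re


def parse_cisco_logging(running_config):
    """Extract logging facts from IOS-style running-config text."""
    servers = re.findall(r"^logging host (\S+)", running_config, re.MULTILINE)
    level = re.search(r"^logging trap (\S+)", running_config, re.MULTILINE)
    stamps = re.search(r"^service timestamps log datetime", running_config, re.MULTILINE)
    return {
        "syslog_servers": servers,
        # None means the level was not stated explicitly in the output.
        "logging_level": level.group(1) if level else None,
        "timestamp_enabled": stamps is not None,
    }


def parse_fortigate_syslog(config_block):
    """Extract the syslog server from a 'config log syslogd setting' block."""
    enabled = re.search(r"^\s*set status enable\s*$", config_block, re.MULTILINE)
    server = re.search(r'^\s*set server "?([^"\s]+)"?', config_block, re.MULTILINE)
    # A configured server only counts if the syslog destination is enabled.
    servers = [server.group(1)] if (enabled and server) else []
    return {"syslog_servers": servers}


# Illustrative sample output, not captured from a real device.
CISCO_SAMPLE = (
    "service timestamps log datetime msec\n"
    "logging trap informational\n"
    "logging host 192.0.2.10\n"
    "logging host 192.0.2.11\n"
)
FORTI_SAMPLE = (
    "config log syslogd setting\n"
    "    set status enable\n"
    '    set server "192.0.2.10"\n'
    "end\n"
)

cisco_facts = parse_cisco_logging(CISCO_SAMPLE)
forti_facts = parse_fortigate_syslog(FORTI_SAMPLE)
```

Both parsers emit `syslog_servers` as a list, so the compliance engine can compare counts without knowing which vendor produced the fact.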

What I didn’t expect was how much value came from storing failed parse attempts. Those records showed us which device families were drifting from our assumptions, and they also exposed old switch stacks that still had legacy TACACS syntax from before our last refresh. In my environment, parser errors became a better inventory signal than our inventory system.
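
One way to capture failed parse attempts is to wrap each parser call so that an exception becomes a stored record rather than a crashed job. The sketch below assumes a hypothetical `parse_with_record` wrapper and a deliberately strict example parser; neither name comes from the original tooling.

```python
from datetime import datetime, timezone


def parse_with_record(hostname, platform, raw_output, parser):
    """Run a parser; on failure return a structured record instead of raising."""
    try:
        return {"ok": True, "hostname": hostname, "facts": parser(raw_output)}
    except Exception as exc:  # broad on purpose: any parser failure is worth keeping
        return {
            "ok": False,
            "hostname": hostname,
            "platform": platform,
            "error": f"{type(exc).__name__}: {exc}",
            "raw_output": raw_output,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        }


def strict_parser(text):
    # Hypothetical parser that refuses output it does not recognize.
    if "logging" not in text:
        raise ValueError("no logging section found")
    return {"has_logging": True}


# A device returning unexpected legacy syntax produces a failure record.
failure = parse_with_record("sw1", "cisco_xe", "tacacs-server host 192.0.2.5", strict_parser)
```

Keeping the raw output in the failure record is what makes it useful later: the drift is visible without reconnecting to the device.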

Normalization beat cleverness.

Collect Evidence Auditors Can Actually Use

Our auditors did not want a dashboard screenshot with a green percentage. They wanted evidence they could trace back to a device, a control, a command, a timestamp, and a responsible owner. We generated a PDF report, a CSV finding export, and a raw evidence bundle for every run. Each record included hostname, management IP, platform, OS version, control ID, command executed, normalized fact output, result, and parser confidence.
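
The record fields listed above map naturally onto a small data class. The sketch below assumes the normalized facts are stored as JSON text and that parser confidence is a number between 0 and 1; both are assumptions for illustration, as are the class and function names.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class EvidenceRecord:
    hostname: str
    management_ip: str
    platform: str
    os_version: str
    control_id: str
    command: str
    normalized_facts: str      # JSON-encoded facts, kept as text for CSV export
    result: str                # "pass" or "fail"
    parser_confidence: float   # assumed scale: 0.0 to 1.0


def to_csv(records):
    """Serialize evidence records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(EvidenceRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()
```

Deriving the CSV header from the data class means a new evidence field shows up in the export without a second edit.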

The before-and-after metric changed the conversation. Manual review took 3 weeks for the full 150-device sample when we included collection, peer review, spreadsheet cleanup, and audit packaging. After we normalized output parsing per vendor, the automated compliance check covered all 23 controls in 94 minutes, and its results agreed with manual review 97% of the time. The remaining 3% were not random failures; they were mostly devices with broken inventory metadata, stale credentials, or unsupported software trains.

  • Every command result was stored with a UTC timestamp and collection job ID.
  • Every failed control included the exact configuration evidence that triggered the failure.
  • Every parser warning was separated from control failure so auditors could see uncertainty clearly.
  • Every device report included Cisco IOS XE 17.9.4 or FortiOS 7.4.3 version context when available.
  • Every PDF had a hash value so we could prove the delivered report had not changed.

Auditors trust boring evidence.
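
The hash and timestamp items in that list can be sketched with the standard library. SHA-256 is an assumption here (the practice above only requires some hash value), and the manifest layout and function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def build_manifest_entry(filename, data, job_id):
    """Describe a delivered artifact: digest, job ID, and UTC generation time."""
    return {
        "filename": filename,
        "sha256": hashlib.sha256(data).hexdigest(),
        "job_id": job_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_artifact(entry, data):
    """Return True if the artifact bytes still match the recorded digest."""
    return hashlib.sha256(data).hexdigest() == entry["sha256"]


# Illustrative bytes standing in for a real PDF report.
report_bytes = b"%PDF-1.7 sample report"
entry = build_manifest_entry("report.pdf", report_bytes, "job-001")
```

An auditor who recomputes the digest on the delivered file and compares it to the manifest can confirm the report was not altered after generation.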

I also learned to keep the report language plain. A failed AC-17 check did not say “remote access posture deviation detected.” It stated whether SSH is enabled, whether Telnet is disabled, which servers AAA points to, and whether the device matches the approved baseline. That kind of wording reduced follow-up meetings because nobody had to decode automation language before asking a useful question.

Track Exceptions Without Hiding Risk

Exceptions were the part my team underestimated. We had plant-floor network segments where a control could not be applied cleanly because vendor equipment depended on older management behavior. Some devices supported SNMPv3 but had monitoring dependencies that still expected SNMPv2c during the migration window. Other devices sat behind jump hosts where direct collection was intentionally blocked.

I treated exceptions as first-class records instead of comments in a spreadsheet. Each exception needed a control ID, device or device group, business owner, expiration date, compensating control, approval reference, and retest date. If an exception had no expiration date, I considered it a policy bypass dressed up as documentation. That opinion annoyed a few people at first, but it made audit reviews shorter and cleaner.

No expiration, no exception.
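
That rule is easy to enforce in code. A minimal sketch, assuming exceptions are modeled as a data class whose fields mirror the list above; the class name and the `is_expired` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ControlException:
    control_id: str
    scope: str                      # device or device group
    business_owner: str
    expiration_date: Optional[date]
    compensating_control: str
    approval_reference: str
    retest_date: date

    def __post_init__(self):
        # Reject open-ended exceptions at creation time.
        if self.expiration_date is None:
            raise ValueError("No expiration, no exception")

    def is_expired(self, today):
        """Return True once the expiration date has passed."""
        return today > self.expiration_date
```

Rejecting the record at construction means an open-ended exception never reaches the evidence repository in the first place.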

Compensating controls needed the same evidence discipline as normal controls. If a FortiGate firewall could not meet a logging requirement exactly because of a legacy integration, we documented the alternate syslog path, collector retention setting, restricted admin access, and review cadence. If a Cisco distribution switch could not enforce the current banner or timeout standard due to an operational constraint, we documented why, who accepted it, and when we would remove it.

The best part was trend visibility. Once exceptions lived beside normal findings, we could show which ones were aging, which ones repeated across sites, and which ones were really engineering debt. My opinion is that exception tracking is where compliance automation becomes operationally useful instead of just audit-friendly.

Run Compliance As An Engineering Workflow

We stopped treating NIST checks as a quarterly scramble and started running them like a production job. The collector runs from Ubuntu 22.04, uses Python 3.11 virtual environments, writes structured JSON, and pushes report artifacts into our internal evidence repository. We schedule the full 150-device run before the formal audit window, then run smaller deltas after firewall changes, switch upgrades, and identity platform maintenance.

I keep the control definitions in version control because compliance logic changes need peer review. A change to the AU-12 requirement is no different from a firewall policy change in my mind; both can create false confidence if nobody reviews the intent. We tag control packs by audit period, so when an auditor asks what rules were used for Q2, I can point to the exact commit instead of saying we think the script was current.
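
A version-control tag answers the question from the repository side. One possible complement, sketched below, is to stamp each report with a digest of the control definitions it was evaluated against. This is an assumption layered on top of the tagging practice, not a description of it, and the function names are hypothetical.

```python
import hashlib
import json


def control_pack_digest(control_map):
    """Return a stable SHA-256 digest of the control definitions."""
    # sort_keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(control_map, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stamp_report(report, control_map, audit_period):
    """Return a copy of the report annotated with audit period and pack digest."""
    stamped = dict(report)
    stamped["audit_period"] = audit_period
    stamped["control_pack_sha256"] = control_pack_digest(control_map)
    return stamped
```

If the digest in a report does not match the digest of the tagged control pack, the report was produced with different rules than the ones under review.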

The script is not the control.

The control is the operational behavior we can prove. Automation only gives us a repeatable way to observe it. That distinction matters when a device passes the parser but the process around it is weak, or when a device fails because our standard is stricter than the minimum NIST 800-53 language. I would rather defend a strict internal baseline than explain why we accepted weak configuration because the framework wording left room.

My final stance is blunt: manual NIST review across network devices is not a badge of diligence anymore. It is slow, inconsistent, and too dependent on whoever drew the audit task that week. Automated compliance checking does not remove engineering judgment, but it puts that judgment where it belongs: in control mapping, exception review, and remediation decisions.

The report mattered, but the real win was turning compliance from a quarterly memory test into repeatable engineering evidence.
