GitHub Copilot vs Cursor vs Claude Code: A Network Engineer Tests All Three

My manager asked me to evaluate AI coding tools for the network ops team, a team that writes more Python scripts and YAML than application code.

Start With The Job, Not The Hype

I tested GitHub Copilot, Cursor, and Claude Code from the perspective of our manufacturing facility IT security team, where most of my useful code touches firewalls, switches, log collectors, backup jobs, and inventory files. We are not building a SaaS dashboard every week. We are usually writing Python 3.11 scripts on Ubuntu 22.04, cleaning JSON from firewall APIs, validating YAML, generating change reports, or automating repetitive checks against FortiOS 7.4.3 devices.

I initially benchmarked all three on writing Python web apps, which is irrelevant to what my team actually does with these tools. That was my mistake. The output looked impressive, but it told me almost nothing about whether the tool could help us normalize address objects, compare FortiGate policies, or produce a clean rollback plan before a maintenance window.

The tools are not competing for the same job. In my testing, GitHub Copilot was the best autocomplete, Cursor was the best editor integration, and Claude Code was the best infrastructure automation assistant. Once I stopped treating them like three versions of the same product, the evaluation became much more useful.

That distinction matters.

For our team, the question was not “Which AI writes the most code?” It was “Which tool reduces mistakes when we are changing security infrastructure under time pressure?” My opinion after testing is simple: network and security teams should stop asking for one winner and start matching each tool to the workflow where it is strongest.

Test On Infrastructure Work, Not Toy Problems

I rebuilt the test plan around tasks we actually perform. I used Python 3.11, Ubuntu 22.04, FortiOS 7.4.3 API responses, sample Cisco-style configuration exports, YAML inventory files, and sanitized firewall policy data from our environment. I did not use LeetCode, web app scaffolding, or generic CRUD examples because none of those resemble my normal Tuesday.

The strongest test was a policy audit workflow. I gave each tool a small repository with multiple files: an API client, a YAML device inventory, sample FortiGate address objects, policy exports, and a half-written report generator. The goal was to find unused address groups, detect overly broad services, flag policies without comments, and produce a CSV that our change review process could understand.
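To make the audit step concrete, here is a minimal sketch of flagging broad services and missing comments and writing the result as CSV. It is not the script from the test repository. The field names (policyid, srcaddr, dstaddr, service, comments) are assumptions modeled on a FortiOS-style policy export, treating "ALL" as the broad service is an assumption, and the sample policies are invented.

```python
import csv
import io

# Assumption: a service named "ALL" marks an any-service rule.
BROAD_SERVICES = {"ALL"}


def names(entries: list[dict]) -> str:
    """Join the 'name' values of a list like [{'name': 'lan'}, ...]."""
    return ";".join(entry.get("name", "") for entry in entries)


def audit_policy(policy: dict) -> list[str]:
    """Return risk notes for one policy dict (assumed field names)."""
    notes = []
    services = {entry.get("name") for entry in policy.get("service", [])}
    if services & BROAD_SERVICES:
        notes.append("overly broad service")
    if not (policy.get("comments") or "").strip():
        notes.append("missing comment")
    return notes


def build_report(policies: list[dict]) -> str:
    """Render flagged policies as CSV text, one row per flagged policy."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["rule_id", "source", "destination", "service", "risk_note"],
        lineterminator="\n",
    )
    writer.writeheader()
    for policy in policies:
        notes = audit_policy(policy)
        if not notes:
            continue  # clean rules stay out of the review report
        writer.writerow({
            "rule_id": policy.get("policyid"),
            "source": names(policy.get("srcaddr", [])),
            "destination": names(policy.get("dstaddr", [])),
            "service": names(policy.get("service", [])),
            "risk_note": "; ".join(notes),
        })
    return buffer.getvalue()


# Invented sample data, for illustration only.
sample = [
    {"policyid": 1, "srcaddr": [{"name": "lan"}], "dstaddr": [{"name": "all"}],
     "service": [{"name": "ALL"}], "comments": ""},
    {"policyid": 2, "srcaddr": [{"name": "plc-net"}], "dstaddr": [{"name": "historian"}],
     "service": [{"name": "HTTPS"}], "comments": "historian access"},
]
print(build_report(sample))
```

Only the first sample policy is reported, because the second has a specific service and a comment. Writing to a string buffer keeps the sketch testable; a real run would write to a file instead.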

The task set had five parts:

Parse FortiOS 7.4.3 policy export JSON without dropping nested address groups
Validate YAML device inventory before connecting to anything
Generate a CSV report with rule ID, source, destination, service, and risk note
Add retry and timeout handling to the API client
Write a rollback checklist from the detected policy changes
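The retry and timeout item is the easiest of these to get subtly wrong, so here is a minimal sketch using only the standard library. The function name fetch_with_retry, its parameters, and the example URL are hypothetical, and the stand-in opener exists only so the example runs without a firewall on the network.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, *, attempts=3, timeout=10.0, backoff=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a URL, retrying transient failures with exponential backoff.

    `opener` and `sleep` are injectable so the logic can be exercised offline.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            # The timeout bounds each attempt so a hung device cannot stall the run.
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError as exc:
            # A 4xx means the request itself is wrong; retrying will not help.
            if exc.code < 500 or attempt == attempts:
                raise
        except (urllib.error.URLError, TimeoutError):
            if attempt == attempts:
                raise  # out of retries: surface the original error
        sleep(backoff * 2 ** (attempt - 1))


class _FakeResponse:
    """Offline stand-in for the object urlopen returns."""

    def __init__(self, body):
        self._body = body

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False

    def read(self):
        return self._body


attempt_log = []


def flaky_opener(url, timeout):
    """Fail twice, then succeed, to simulate a briefly unreachable device."""
    attempt_log.append(timeout)
    if len(attempt_log) < 3:
        raise urllib.error.URLError("simulated outage")
    return _FakeResponse(b'{"status": "success"}')


delays = []
body = fetch_with_retry("https://fw.example.invalid/api", timeout=5.0,
                        opener=flaky_opener, sleep=delays.append)
print(body, delays)
```

The design choice worth copying is the split between errors that deserve a retry (timeouts, connection failures, 5xx responses) and errors that do not (4xx responses), since blindly retrying a rejected request only delays the real failure.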

I also measured elapsed working time, not generated lines of code. Across the same task set, Claude Code cut my script-writing time by 55% and Copilot cut it by 30%. Cursor landed between them, at roughly 37% to 47%, depending on how much of the task stayed inside the editor versus the terminal.

Lines of code were the wrong metric.

The exact numbers came from three repeated passes through the same task set. My baseline manual run took 132 minutes. Copilot-assisted work took 92 minutes. Claude Code-assisted work took 59 minutes. Cursor varied from 70 to 83 minutes depending on how much refactoring I needed after the first pass. I trust elapsed time more than token counts, completion counts, or demo polish.
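The percentages follow directly from those minutes. A short check of the arithmetic, using only the figures above (the dictionary labels are just names for this example):

```python
BASELINE_MINUTES = 132  # manual run

assisted_minutes = {
    "Copilot": 92,
    "Claude Code": 59,
    "Cursor (fastest pass)": 70,
    "Cursor (slowest pass)": 83,
}


def reduction_percent(minutes: int, baseline: int = BASELINE_MINUTES) -> float:
    """Share of the baseline time saved, as a percentage rounded to one decimal."""
    return round((baseline - minutes) / baseline * 100, 1)


reductions = {tool: reduction_percent(m) for tool, m in assisted_minutes.items()}
for tool, percent in reductions.items():
    print(f"{tool}: {percent}% less time than the manual baseline")
```

This gives 30.3% for Copilot, 55.3% for Claude Code, and 37.1% to 47.0% for Cursor, which matches the rounded 30% and 55% figures.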

Claude Code Worked Best Across Files And Commands

Claude Code felt the most useful when the task crossed file boundaries. It understood that the YAML schema, the FortiGate API client, the parser, and the reporting module were part of one system. That mattered when I asked it to add validation before execution, because it found the config loading path, adjusted the calling code, and suggested a test shape that matched the existing repository.

The terminal integration also changed how I worked. I could ask it to inspect failures, run a script, interpret the traceback, patch the right file, and rerun the validation without me copying error text back and forth. In infrastructure automation, that loop is more valuable than a perfect first completion because our data is messy and our edge cases usually appear only after a real config export hits the parser.

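def validate_device_inventory(devices: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the inventory is usable."""
    supported_platforms = {"fortigate", "cisco_ios", "panos"}
    errors = []

    # Count from 1 so messages match how people count entries in the file.
    for index, device in enumerate(devices, start=1):
        name = device.get("name")
        host = device.get("host")
        platform = device.get("platform")

        # Collect every problem instead of stopping at the first one.
        if not name:
            errors.append(f"device {index}: missing name")
        if not host:
            errors.append(f"{name or 'unknown'}: missing host")
        if platform not in supported_platforms:
            errors.append(f"{name or 'unknown'}: unsupported platform {platform!r}")

    return errors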

What I didn’t expect was how much value came from planning. Claude Code was better at saying, “First normalize the vendor objects, then compare policy references, then generate the report,” instead of jumping straight into one function. That is how experienced infrastructure engineers think, especially when a script might touch production firewalls later.

The plan was the feature.

I would not use Claude Code as my only editor, and I still reviewed every command before it ran against anything real. But for multi-file automation, test repair, script hardening, and turning rough operational intent into working tooling, it was the strongest assistant in the group. For our environment, that is the highest-value category.

Cursor Felt Best When I Was Already Editing

Cursor won when I stayed inside the IDE and moved quickly between related edits. It was especially good when I had a parser open beside a test file and needed inline changes that respected the surrounding code. The chat felt closer to the editor than a separate assistant, and for refactoring awkward Python functions or cleaning up YAML handling, that closeness made the work feel faster.

Its biggest advantage over Copilot was context control. I could point it at the files that mattered, ask for a targeted rewrite, and keep reviewing the diff in place. That worked well for tasks like splitting a large firewall report function into smaller helpers or renaming fields across a few modules without losing track of the change.

Cursor was less impressive when the work became operational. When I needed to run scripts repeatedly, inspect terminal output, adjust arguments, and reason through environment state on Ubuntu 22.04, Claude Code felt more natural. Cursor could help, but I found myself doing more orchestration manually.

Editor gravity is real.

My opinion is that Cursor is the best fit for engineers who live in code all day and want AI directly embedded into editing. If our team were building a larger internal automation platform, I would want Cursor in the stack. For smaller network automation scripts, it is excellent, but it does not replace the need for a terminal-aware assistant.

Copilot Was Fastest At Small Completions

GitHub Copilot still has a clean place in our workflow. It is fast, quiet, and useful when I already know exactly what I am writing. If I start typing a function to convert FortiOS 7.4.3 service objects into a normalized dictionary, Copilot often finishes the boring parts correctly. It is also good at repetitive unit test structure, argparse boilerplate, and predictable data transformations.
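For a sense of the kind of function meant here, this is a minimal sketch of a service normalizer. It is an illustration, not Copilot output. The input keys (name, tcp-portrange, udp-portrange) and the port-range string format are assumptions about what a FortiOS-style custom service object looks like, and the output shape is hypothetical.

```python
def parse_port_ranges(raw: str) -> list[tuple[int, int]]:
    """Turn a string like '443 8080-8090' into [(443, 443), (8080, 8090)].

    Assumption: ranges are space-separated, and anything after ':' in a
    token is a source-port range, which this sketch ignores.
    """
    ranges = []
    for token in raw.split():
        destination = token.split(":", 1)[0]
        low, _, high = destination.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def normalize_service(obj: dict) -> dict:
    """Map one service object to a small vendor-neutral dictionary."""
    return {
        "name": obj["name"],
        "tcp": parse_port_ranges(obj.get("tcp-portrange") or ""),
        "udp": parse_port_ranges(obj.get("udp-portrange") or ""),
    }


# Invented sample object, for illustration only.
web_mgmt = normalize_service(
    {"name": "web-mgmt", "tcp-portrange": "443 8080-8090", "udp-portrange": ""}
)
print(web_mgmt)
```

The mapping is mechanical once the input format is known, which is why this is good autocomplete territory. The part that still needs a human is confirming the format against a real export.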

Copilot struggled when I needed broader reasoning. It could complete the next block, but it did not consistently understand the whole operational goal unless I packed the surrounding files and intent into the prompt. That is not really a failure; it is just the wrong job. Autocomplete should not be expected to act like a senior engineer walking through a change plan.

I also noticed that Copilot could make weak assumptions look smooth. In one case, it generated code that handled flat address groups but missed nested groups. The code looked clean, passed a shallow sample, and would have produced a misleading report against real firewall data. That is exactly the kind of mistake my team worries about.
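The difference between the flat and nested handling is small in code and large in effect. A minimal sketch, assuming groups are already loaded into a dictionary that maps each group name to its member names (that structure, the function name expand_group, and the sample names are all hypothetical):

```python
def expand_group(name: str, groups: dict[str, list[str]], _seen=None) -> set[str]:
    """Return the leaf address names under `name`, following nested groups."""
    seen = _seen if _seen is not None else set()
    if name in seen:
        return set()  # already visited: guards against circular membership
    seen.add(name)

    leaves = set()
    for member in groups.get(name, []):
        if member in groups:
            # The member is itself a group, so descend into it.
            leaves |= expand_group(member, groups, seen)
        else:
            leaves.add(member)
    return leaves


# Invented sample data: "site-a" contains one address and one nested group.
groups = {
    "site-a": ["hmi-01", "plc-group"],
    "plc-group": ["plc-01", "plc-02"],
}

flat_view = set(groups["site-a"])            # treats "plc-group" as an address
nested_view = expand_group("site-a", groups)  # resolves it to its members

print(sorted(flat_view))
print(sorted(nested_view))
```

The flat view reports two members and silently hides both PLC addresses behind a group name. A sample export without nesting would never expose the gap, so a test fixture with at least one nested group is worth adding.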

Smooth is not the same as correct.

I would still license Copilot for engineers who write scripts regularly. It reduces friction and keeps momentum high. But I would position it as an acceleration layer, not as the main assistant for infrastructure automation. In my view, Copilot is worth having, but dangerous to over-credit.

Choose Based On The Work Your Team Actually Does

For my network and security team, the recommendation is not a single-tool mandate. I would use Copilot for everyday autocomplete, Cursor for heavier editor-native refactoring, and Claude Code for infrastructure automation that spans files, commands, tests, and operational reasoning. That mix matches how we actually work.

If I had to pick one for our manufacturing environment, I would pick Claude Code first because our bottleneck is not typing. Our bottleneck is turning messy network intent into reliable automation without missing edge cases that can break a change window. The 55% writing-time reduction mattered, but the better planning and validation mattered more.

My practical rollout would be conservative. Keep production credentials out of prompts, use sanitized exports for development, require code review for generated changes, and run new scripts against read-only API endpoints before allowing write operations. AI does not remove our change process. It makes a weak process fail faster.
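One way to enforce the read-only step in code rather than by convention is a thin guard around whatever function sends requests. This is a minimal sketch under assumptions: GuardedClient, ReadOnlyViolation, and the send callable are hypothetical names, and the API paths are only illustrative.

```python
class ReadOnlyViolation(RuntimeError):
    """Raised when a write is attempted through a read-only client."""


class GuardedClient:
    """Wrap a send function and refuse write methods unless explicitly enabled."""

    SAFE_METHODS = {"GET", "HEAD"}

    def __init__(self, send, allow_writes: bool = False):
        self._send = send  # callable taking (method, path, **kwargs)
        self._allow_writes = allow_writes

    def request(self, method: str, path: str, **kwargs):
        method = method.upper()
        if method not in self.SAFE_METHODS and not self._allow_writes:
            raise ReadOnlyViolation(f"{method} {path} blocked: client is read-only")
        return self._send(method, path, **kwargs)


# Offline stand-in for a real transport, so the example runs anywhere.
sent = []


def fake_send(method, path, **kwargs):
    sent.append((method, path))
    return {"status": "success"}


client = GuardedClient(fake_send)
client.request("get", "/api/v2/cmdb/firewall/policy")

blocked_message = None
try:
    client.request("DELETE", "/api/v2/cmdb/firewall/policy/12")
except ReadOnlyViolation as exc:
    blocked_message = str(exc)

print(sent)
print(blocked_message)
```

The guard defaults to read-only, so a generated script has to opt in to writes in a way that shows up in code review. It complements a read-only API account on the device and does not replace one.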

That is the part I care about.

My final opinion is that network engineers should evaluate these tools with firewall exports, switch configs, inventory files, and real operational scripts. Web app demos will flatter the wrong strengths. In our environment, the best tool was the one that understood the work around the code, not just the next line of code.
