
Multi-technology orchestration with Robot Framework

by Rainer Haupt

TL;DR: Robot Framework’s distinctive strength is not the reporting layer, not readability, not the Foundation. It is the technology-agnostic orchestration layer: GUI, REST, SOAP, database, SSH and CLI in one test suite with a unified report. Playwright and Cypress are specialised browser tools that fall short in enterprise scenarios spanning three or more technologies. Python as the glue language under RF makes the difference. The weakness remains: around 30–40 % execution overhead versus pytest. Above 10,000 tests that becomes a CI/CD bottleneck.

Reading time approx. 13 min · As of: 2026-04


Comparing Robot Framework against Playwright or pytest pits frameworks that solve different problems against each other. Playwright is an excellent browser-automation tool. pytest is a first-class test runner for Python code. Both have a clear sweet spot — and that sweet spot is not “the entire enterprise test flow”. This is exactly where RF brings its one truly differentiating advantage to bear: a test that orchestrates browser, REST, SOAP, database, SSH and CLI cannot be expressed in any of those tools without significant custom engineering effort.

One test suite, six technologies

The RF core knows nothing about the test object — everything runs through replaceable libraries. A single .robot file can combine:

Library | Technology | Underlying tech
SeleniumLibrary / Browser | Browser GUI | Selenium / Playwright
RequestsLibrary | REST APIs | Python requests
SoapLibrary / ZeepLibrary | SOAP APIs | Python zeep
DatabaseLibrary | SQL assertions (PostgreSQL, MySQL, Oracle, MSSQL, SQLite) | DB-API 2.0
SSHLibrary | Remote commands | Python paramiko
Process / OperatingSystem | CLI, file system | Built-in

All in a single HTML report, with tag statistics and keyword execution traces.

Practice confirms the pattern. Spryker runs an open-source RF suite that combines API tests (RequestsLibrary) with UI tests (Browser/Playwright) in one project. Accruent Zoomba bundles APILibrary, GUILibrary and SOAPLibrary in one import. Sogeti Labs reports “great experiences with Robot Framework in complex system landscapes, including API, E2E, Web Application, and Back End Testing”. RoboCon 2025 in Helsinki ran a workshop on BPMN-driven end-to-end test orchestration with RF.

The DatabaseLibrary deserves a note: it supports multiple concurrent DB connections with aliases. A test can run queries against a PostgreSQL transactional database and a MySQL reporting database in parallel and assert on both results. A typical scenario: verify a REST API call by checking the transactional and reporting databases simultaneously.
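Outside RF, the equivalent dual-database check reads like this in plain Python. A minimal sketch using two sqlite3 in-memory databases as stand-ins for the transactional and reporting stores; all table and column names are hypothetical:

```python
import sqlite3

# Two independent DB-API 2.0 connections, standing in for the aliased
# PostgreSQL (transactional) and MySQL (reporting) connections in RF.
txn = sqlite3.connect(":memory:")
rpt = sqlite3.connect(":memory:")

txn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
txn.execute("INSERT INTO orders VALUES (42, 'CONFIRMED')")
rpt.execute("CREATE TABLE order_facts (order_id INTEGER, amount REAL)")
rpt.execute("INSERT INTO order_facts VALUES (42, 99.90)")

def verify_order(order_id):
    """Assert on both databases in one step, as an aliased RF test would."""
    status = txn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()[0]
    amount = rpt.execute(
        "SELECT amount FROM order_facts WHERE order_id = ?",
        (order_id,)).fetchone()[0]
    return status, amount

print(verify_order(42))  # → ('CONFIRMED', 99.9)
```

In RF, the same pattern is two `Connect To Database` calls with different aliases and one query keyword per connection.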

Where Playwright and Cypress hit limits outside the browser

Playwright explicitly positions itself as a “framework for Web Testing and Automation”. With APIRequestContext it covers REST — that works. Database, SSH, SOAP: each time, pull in your own npm package and write custom code. There is no PlaywrightSSHLibrary, no PlaywrightSOAPLibrary, no PlaywrightDatabaseLibrary. Each team builds the integration layer from scratch.

Cypress is even more explicit. Its own documentation says: “Cypress is not a general purpose automation tool.” The only escape hatch for non-browser tests is cy.task() — a bridge from the browser context into a Node.js process. cy.task() has a 60-second default timeout, no retry mechanism, only accepts JSON-serialisable single arguments, and blocks execution. GitHub Issue #14419 documents an SSH tunnel integration that worked in Cypress 5.6 and broke completely in Cypress 6.0. SOAP support is essentially absent.

Side by side:

Capability | Robot Framework | Playwright | Cypress
SOAP APIs | SoapLibrary, ZeepLibrary, SudsLibrary | no support — manual XML/HTTP | no support
DB assertions | DatabaseLibrary (multi-DB, aliased connections) | manual npm package import | cy.task() escape hatch
SSH commands | SSHLibrary (paramiko-based) | no support — manual npm package | cy.task() (fragile)
Message queues | custom Python library (trivially built) | manual npm package | cy.task()
Unified reporting | all technologies in one HTML report | separate per capability | browser-focused only
Multi-tech in one test | native, one .robot file | own integration layer required | own integration layer required

For three or more technologies in coordinated test flows — browser submit, then API verify, then DB check, then SSH log file — RF provides ready-made, community-maintained libraries for each step. Playwright and Cypress require significant custom engineering for the same.
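A sketch of what such a thin keyword layer can look like in Python. Everything here is hypothetical: the browser and SSH steps are replaced by local stand-ins so the example stays self-contained, and in a real suite each function would be backed by Browser, RequestsLibrary or SSHLibrary calls and surface in RF as keywords like `Verify Order In Database`:

```python
import os
import sqlite3
import tempfile

# In-memory DB standing in for the system under test's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, state TEXT)")

def submit_order(order_id):
    # Stands in for the browser-submit step.
    db.execute("INSERT INTO orders VALUES (?, 'NEW')", (order_id,))

def verify_order_in_database(order_id, expected_state):
    row = db.execute(
        "SELECT state FROM orders WHERE id = ?", (order_id,)).fetchone()
    assert row and row[0] == expected_state, f"order {order_id}: {row}"

def log_contains(path, needle):
    # Stands in for the SSH log-file step.
    with open(path) as fh:
        assert needle in fh.read(), f"{needle!r} not found in {path}"

# Coordinated flow: submit, then DB check, then log check.
submit_order(7)
verify_order_in_database(7, "NEW")
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as fh:
    fh.write("order 7 accepted\n")
log_contains(fh.name, "order 7 accepted")
os.unlink(fh.name)
print("flow ok")
```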

Python as the glue language is the real competitive edge

The “thin Robot layer” architecture turns every Python function into an RF keyword automatically: def connect_to_database() becomes the keyword Connect To Database. The @keyword decorator and PythonLibCore enable larger modular libraries.
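The naming convention can be illustrated in a few lines of Python; `to_keyword_name` is a hypothetical helper that mimics the transformation (RF’s actual keyword matching is additionally case- and underscore-insensitive):

```python
def to_keyword_name(func_name):
    """Mimic how RF derives a keyword name from a Python function name:
    underscores become spaces, each word is title-cased."""
    return func_name.replace("_", " ").title()

def connect_to_database():
    """In a real library, this would open the DB connection."""

print(to_keyword_name(connect_to_database.__name__))  # → Connect To Database
```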

The real strength lives in the language ecosystems Playwright and Cypress have to compete with:

Area | Python | JavaScript / Node.js
SOAP | zeep — mature, pythonic, CLI for WSDL inspection | node-soap — self-described as “still working out some kinks regarding namespaces”
SSH | paramiko — gold standard since 2003 | ssh2 — less mature, less widespread
DB interface | DB-API 2.0 (PEP 249) — standardised across all drivers | no standardised interface
Scientific / data | pandas, numpy, scipy | less mature
Legacy systems | excellent (SOAP, LDAP, SNMP, Telnet) | patchy
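PEP 249 in practice: the same four calls work against any conforming driver. A minimal sketch with sqlite3 standing in; swapping in psycopg2 or mysqlclient changes only the connect() call and the parameter placeholder style:

```python
import sqlite3

# The PEP 249 surface: connect() -> cursor() -> execute() -> fetch*().
# Identical across drivers; Node.js has no equivalent standard interface.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
cur.execute("SELECT SUM(n) FROM t")
print(cur.fetchone()[0])  # → 6
```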

From the RF forum: “My approach is to only write test cases in robot files and all implementations will be on Python as custom libraries.” Xebia adds: “These Python functions are basically small wrappers — through them the integration layer wraps the external test driver.”

What Robocorp wrote when deprecating RF in 2024 is striking: “Robot Framework has served us well — all based on Python. By focusing on Python only, we allow scale and speed that only comes with Python’s ecosystem.” That confirms the point indirectly: the power was always in Python. The question is only whether the RF layer above adds enough value to justify the overhead.

When non-developers actually write RF tests

The RF marketing promise that “testers without a programming background write tests” is qualified in many field reports. The reality is more precise: non-developers do write RF tests, but only under clearly nameable preconditions.

The precondition is a keyword library built up by the technical team beforehand. A Hacker News report (anbotero) describes the process: “a large set of frequently used phrases” for UI and API interaction, six focused months of build-up work, after which “people were happy”. Important distinction in the same report: those were “technical PMs — PMs that read API documentation — just not ‘coding much’ PMs”. So domain experts with basic technical literacy, not non-developers in the strict sense.

The organisational preconditions are nameable: layered keyword architecture (Xebia describes three levels — technical workflow, workflow activity, business rule), BDD/Gherkin syntax support in RF, prepared keyword vocabularies for the domain, RF style-guide discipline (“Test cases should not look like scripts”).

Substantially more common and more realistic is the read scenario: business analysts, product owners and auditors verify RF tests but do not write them themselves. Well-designed RF tests read like specifications:

*** Test Cases ***
Complete Order With Discount Code
    Open Product Page    Laptop X1
    Add Product To Cart
    Enter Discount Code    SUMMER2025
    Submit Order
    Verify Order Confirmation    Discount 10%

A PO or BA can verify whether that is the correct business process. The pytest equivalent — a function test_order_with_discount_code — is harder for non-technical stakeholders to read.
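For comparison, a sketch of that pytest-style equivalent; the `Shop` page-object stub is hypothetical and exists only so the example runs standalone. The business process is still there, but a non-technical reader now has to parse Python syntax to find it:

```python
# Hypothetical page-object stub; in a real suite these methods
# would drive a browser.
class Shop:
    def __init__(self):
        self.cart, self.discount, self.confirmed = [], None, False

    def open_product_page(self, name):
        self.product = name

    def add_to_cart(self):
        self.cart.append(self.product)

    def enter_discount_code(self, code):
        self.discount = code

    def submit_order(self):
        self.confirmed = True

def test_order_with_discount_code():
    shop = Shop()
    shop.open_product_page("Laptop X1")
    shop.add_to_cart()
    shop.enter_discount_code("SUMMER2025")
    shop.submit_order()
    assert shop.confirmed and shop.discount == "SUMMER2025"

test_order_with_discount_code()
```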

Against Gherkin/Cucumber, RF has the advantage that it bundles executable keyword libraries out of the box. Cucumber/Gherkin requires teams to code all step definitions themselves. RF’s entry barrier for the thin-layer pattern is noticeably lower.

Compliance documentation in regulated industries

Confirmed RF users in regulated sectors provide the reality check. KONE Corporation — safety-critical embedded software for elevators and escalators, RF Foundation member. Nokia — telecom equipment validation, RF-based Network Test Automation (NTA). OP Financial Group — Finland’s largest financial services group, with a RoboCon 2025 talk. CERN — physics research, with a documented RF quickstart guide.

What RF concretely delivers for compliance:

RF feature | Regulatory benefit
output.xml | Machine-readable, archivable, auditable
report.html | Stakeholder-readable summary with tag statistics
log.html | Exhaustive keyword execution traces with timestamps
[Documentation] | Test case description in natural language
[Tags] | Mapping to requirement IDs ([Tags] REQ-001 RISK-HIGH) — Requirements Traceability Matrix
Plain-text format | Version control, change history, diff capability
Keyword separation | “What to test” (auditor-readable) versus “how to test” (technical)
Translation (RF 6.0+) | Section headers and BDD prefixes in local languages

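The [Tags] data is trivially machine-readable. A minimal sketch of building a traceability matrix from the tags, assuming a simplified output.xml shape (real output.xml carries more structure, but the test/tag extraction is the same idea):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified stand-in for an RF output.xml: <test> elements with <tag>
# children carrying requirement IDs.
sample = """
<robot><suite>
  <test name="Complete Order With Discount Code">
    <tag>REQ-001</tag><tag>RISK-HIGH</tag>
  </test>
  <test name="Reject Expired Discount Code">
    <tag>REQ-001</tag>
  </test>
</suite></robot>
"""

# Requirement ID -> list of test names covering it.
matrix = defaultdict(list)
for test in ET.fromstring(sample).iter("test"):
    for tag in test.iter("tag"):
        if tag.text.startswith("REQ-"):
            matrix[tag.text].append(test.get("name"))

print(dict(matrix))
```

The resulting mapping is the raw Requirements Traceability Matrix; exporting it to CSV or pushing it into an ALM tool is a few more lines.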
What RF does not deliver — the honest gap: no built-in electronic signatures (FDA 21 CFR Part 11), no formal tool qualification for safety-critical standards (DO-178C/DO-330), no native ALM / requirements-management integration. Jira, ReportPortal and Allure serve as third-party bridges. RF provides excellent raw material for compliance documentation; the gap to formal compliance is closed by the surrounding infrastructure (CI/CD, DMS, ALM).

What the RF layer concretely adds over pytest

Five demonstrable benefits:

Benefit | Detail
Tag system | Strongest differentiator at scale. Per-tag statistics in the log header. Tags bindable to Jira bug IDs. Tesena: “There is tagging in pytest, but it is not logged. Almost no one does this.”
Built-in HTML reporting | Zero-config. pytest needs Allure or pytest-html as separate plugins.
Suite setup / teardown | Shares state naturally across all tests in a suite. pytest scope='class' fixtures create separate instances — environment provisioning becomes more complex.
Enforced test organisation | Opinionated section structure (*** Settings ***, *** Variables ***, *** Test Cases ***, *** Keywords ***) prevents wild-west sprawl in large codebases.
Listener interface | Event-driven extensibility without test modification.
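A minimal listener sketch (listener API version 3): RF instantiates the class and calls the hooks around each test, with no change to the test files. The 5-second threshold and the simulated invocation at the end are illustrative only:

```python
from types import SimpleNamespace

class DurationListener:
    # RF listener API v3: end_test receives the executed test (data)
    # and its result; result.elapsedtime is in milliseconds.
    ROBOT_LISTENER_API_VERSION = 3

    def __init__(self):
        self.slow = []

    def end_test(self, data, result):
        if result.elapsedtime > 5000:  # flag tests slower than 5 s
            self.slow.append((data.name, result.elapsedtime))

# Simulate what RF would do for one slow test.
listener = DurationListener()
listener.end_test(SimpleNamespace(name="Slow Test"),
                  SimpleNamespace(elapsedtime=7200))
print(listener.slow)  # → [('Slow Test', 7200)]
```

In a real run the file is attached with `robot --listener DurationListener.py tests/`.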

Four costs as well:

Cost | Detail
30–40 % slower | Keyword-parsing overhead. At 10,000+ tests, that means hours of difference per run.
Parametrisation | pytest @pytest.mark.parametrize dynamically generates 70+ combinations from one definition. RF needs explicit test definitions or the Test Template construct.
Plugin ecosystem | pytest has over 1,300 plugins. RF’s plugin market is smaller and more mixed in maintenance quality.
Debugging | pytest works with the standard Python debugger and IDE breakpoints; the RF DSL makes debugging significantly harder.
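What that parametrisation gap looks like in practice; `apply_discount` is a hypothetical function, and the two stacked marks expand into 4 × 2 = 8 generated test cases from one definition (real suites stack more marks to reach 70+ combinations):

```python
import pytest

def apply_discount(price, percent):
    """Hypothetical function under test."""
    return round(price * (100 - percent) / 100, 2)

# Stacked parametrize marks: pytest runs the cross-product, 8 cases here.
@pytest.mark.parametrize("price", [10.0, 99.90, 250.0, 1999.99])
@pytest.mark.parametrize("percent", [10, 25])
def test_apply_discount(price, percent):
    assert 0 <= apply_discount(price, percent) < price
```

The RF counterpart is a Test Template with one explicit data row per case, or dynamically generated tests via a library.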

The inflection point sits around 10,000 tests. Maxilect, which uses both frameworks across different projects, arrives at a practical heuristic: RF where acceptance testing and non-developer collaboration matter, pytest where developer teams need speed. Several documented enterprise RF → pytest migrations happened above this threshold for performance reasons.

Verdict — when RF (thin layer) is the right choice

Robot Framework is superior when five or more test technologies converge in coordinated test flows, when mixed-skill teams are involved, when regulated industries need readable test documentation and audit-grade HTML reports, when BAs and POs read tests as living documentation, or when unified reporting across all technologies is required.

Robot Framework is not superior when the test suite stays in a single technology (browser only or API only), when the team is purely developers without non-technical stakeholders, when the suite reaches 10,000+ tests and execution speed becomes CI/CD-critical, or when only unit and integration tests are at stake — pytest directly is the better choice for that.

The pragmatic hybrid pattern from multiple field reports: RF for acceptance and end-to-end tests (readability, multi-tech orchestration, reporting), pytest for API, integration and unit tests (speed, flexibility, debugging), and in both cases RF + Browser Library instead of RF + Selenium for noticeably higher browser performance. That is the configuration in which RF plays out its one truly differentiating advantage — without the ballast the framework carries elsewhere.

Evaluating Robot Framework for a heterogeneous test landscape, or assessing the RF + pytest hybrid? In the UTAA workshop we make the architecture decision specifically for your project. Read more about the method, or get in touch directly.
