Case Study

Public Health Surveillance Reporting Modernisation

Client

National Health Organisation

Industry

Health & Medical

The challenge

Monthly population screening coverage reports required manual SQL extraction, R analysis, and Power BI visualisation — all done on-premise by a single analyst. The process was time-consuming, error-prone, and created single-point-of-failure risk.

Our approach

1. Process documentation: Mapped current workflow and identified bottlenecks 2. Technology assessment: Evaluated R/Quarto vs Power BI, assessed Fabric readiness 3. Capability building: Mentored team on advanced R techniques and statistical methods 4. Architecture advisory: Designed data lake approach and Fabric migration pathway

The solution

Documented a clear migration roadmap from legacy SQL Server backups to Microsoft Fabric, with interim improvements to report automation and statistical capabilities.

Impact

40% through automation Analyst time reduced

Fabric-ready architecture Migration roadmap

3 days → Same day Report delivery time

Eliminated Single-point-of-failure

Technologies

RQuartoPower BISQL ServerMicrosoft Fabric (planned)

The outcome

Clear migration roadmap established, improved analytical capabilities through mentoring, and cost reduction opportunities identified.

The Challenge

A national health organisation produced monthly population screening coverage reports to track uptake rates across regions, demographics, and screening modalities. The reports informed policy decisions, resource allocation, and public health messaging — yet the delivery mechanism was a fragile, manual workflow centred on a single analyst.

Each month, the analyst followed a multi-stage process: extract raw data from SQL Server backups (stored as flat files on local drives), clean and transform the data in R, compute coverage statistics and year-on-year comparisons, build visualisations in Power BI, and finally assemble the report package for distribution. The entire process took 2-3 working days and required the analyst to be physically present to manage file transfers and troubleshoot script failures.

Two structural risks emerged: the organisation had no documented process, so knowledge resided entirely in the analyst’s head; and the reliance on local backups meant that a single server failure could lose an entire month’s data. When the analyst took leave or changed roles, the reporting pipeline was at risk of complete disruption.

Workflow Analysis

The first engagement phase focused on process documentation and bottleneck identification. We mapped every step of the monthly reporting cycle, recording time spent, error rates, and handoff points between stages.

The analysis revealed that data extraction and cleaning consumed approximately 45% of total processing time — the largest single block. SQL queries were ad-hoc, embedded in R scripts with hard-coded date ranges and filter conditions. Every monthly run required manual adjustment of these parameters, creating a recurring opportunity for error.

Data transformation involved multiple manual reconciliation steps: matching patient records across separate databases using non-standardised identifiers, handling missing values through manual look-up, and resolving duplicate entries that the source system had not deduplicated. These steps were performed in separate R sessions, with intermediate results saved to CSV files on the analyst’s local machine.

The visualisation phase, while not the largest time consumer, carried the highest error risk. Power BI reports were built from manually exported data extracts, meaning a mistake in the extract step would propagate silently into the final visualisation — potentially misleading senior decision-makers without any indication that the underlying data was incorrect.

Technical Architecture

The solution was designed in two tiers: immediate improvements to the existing workflow, and a longer-term migration to a modern cloud analytics platform.

Interim automation. We restructured the R analysis into a modular, parameterised pipeline using the dbplyr package for database-aware data manipulation. Queries are now defined once and parameterised by month and region, eliminating the manual date-range adjustments that caused errors. The pipeline includes automated data quality checks at each stage — row count validation, null percentage thresholds, and cross-database referential integrity — with clear error messages when data quality falls below acceptable levels.

Report generation uses Quarto to combine the analysis, visualisation, and narrative commentary into a single reproducible document. Quarto renders both the statistical outputs (coverage rates, confidence intervals, trend statistics) and the Power BI embed codes into a unified HTML report. The entire pipeline — data extraction through final report — is triggered by a single R script execution, reducing a 2-3 day manual process to an automated run of approximately 30 minutes.

Long-term architecture. We designed a migration pathway from the current SQL Server backup model to Microsoft Fabric, providing a unified analytics platform with built-in data engineering, warehousing, and real-time visualisation capabilities. The architecture documents cover data source integration, lakehouse schema design, semantic model construction, and Power BI report deployment — all mapped to the organisation’s existing IT governance framework and procurement processes.

Capability Building

Technical solutions are only sustainable when the organisation can maintain them. We structured knowledge transfer around three tracks:

R proficiency. The analyst team received hands-on training in advanced R techniques relevant to their reporting work: functional programming patterns for reusable analysis code, dplyr and dbplyr for efficient data manipulation, and ggplot2/plotly for interactive visualisation. Training was project-specific — every exercise used the team’s actual data and reporting requirements, ensuring immediate applicability.

Statistical literacy. Coverage rate analysis requires understanding of denominators, eligibility criteria, and population movement between regions. We worked with the team to formalise these definitions into documented calculation methods, ensuring that future analysts can reproduce and audit every statistic without relying on institutional memory.

Pipeline maintenance. The new automated pipeline includes inline documentation, a runbook for common failure modes, and a structured logging system that records every processing step with timestamps and data quality metrics. The team was trained to diagnose and resolve common issues independently.

Business Process Improvements

Elimination of single-point-of-failure. The automated, documented pipeline no longer depends on any individual analyst’s knowledge or availability. The process can be executed by any team member using the runbook, and the parameterised scripts can be deployed to a shared execution environment.

Reporting speed. Monthly report delivery time was reduced from 3 working days to same-day availability. This enables the organisation to respond to ad-hoc data requests from senior management and external stakeholders within hours rather than waiting for the next monthly cycle.

Data quality and auditability. Every report is generated from a single, reproducible codebase. The automated quality checks catch data anomalies before they propagate into visualisations, and the full processing log provides an audit trail that can be reviewed for accuracy. This replaces the previous model where data errors could only be detected after the report had been distributed.

Cost trajectory. The Microsoft Fabric migration, once implemented, will further reduce infrastructure costs by consolidating the current multi-tool stack (SQL Server backups, local R installations, Power BI Pro licences, manual file management) into a single cloud platform with usage-based pricing.

Outcomes & Benefits

The project delivered both immediate operational improvements and a strategic roadmap for continued modernisation:

40% analyst time reduction — the automated pipeline eliminates the most time-consuming manual steps (data extraction, cleaning, reconciliation)
3 days reduced to same-day — monthly reports are available on the first business day of the reporting month, enabling timely decision-making
Single-point-of-failure eliminated — the documented, parameterised pipeline can be executed by any trained team member
Migration roadmap delivered — a detailed Fabric architecture plan, scoped to the organisation’s governance requirements and ready for procurement
Team capability uplifted — analysts now possess advanced R skills, statistical documentation practices, and pipeline maintenance knowledge that benefit all future analytical work
Error reduction — automated quality checks catch data anomalies at source, preventing incorrect figures from reaching decision-makers

The interim automation is already in production. The team independently executed the monthly report using the new pipeline during our engagement, with no errors detected during the transition period. The Fabric migration is now under formal review by the organisation’s IT steering committee.