Industrial AI Governance & MLOps worked example

Training Data Volume at 66% data capture uptime: a worked example

This worked example runs the Training Data Volume numbers for a tougher week than the baseline: 66% data capture uptime instead of the typical 92%. The calculation estimates how many usable training records a set of sensor or image data collection cycles produces once uptime and quality losses are taken out.

The inputs for this scenario

  • Training records per collection cycle: 250 records / cycle (held at the documented default)
  • Planned data collection cycles: 120 cycles (held at the documented default)
  • Data capture uptime: 66 % (the input this scenario stresses; the baseline uses 92 %)
  • Usable data quality yield: 85 % (held at the documented default)

Working through the calculation

  • The calculation starts from the formula this tool documents: Gross training data volume = training records per collection cycle × planned data collection cycles. At these inputs that is 250 × 120 = 30,000 records.
  • Data capture uptime loss works out to 10,200 records at these inputs: the 34% of gross volume (30,000 × 0.34) that is never captured, which leaves 19,800 captured records.
  • Data quality yield loss works out to 2,970 records at these inputs: the 15% of captured records (19,800 × 0.15) that do not pass the quality yield.
  • Usable training data volume works out to 16,830 records at these inputs (30,000 − 10,200 − 2,970), and this is the headline figure for the scenario.
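The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name and dictionary keys are hypothetical, and the order of operations (uptime applied to gross volume first, then quality yield applied to what was captured) is inferred from the figures in this example rather than taken from a documented formula.

```python
def training_data_volume(records_per_cycle, planned_cycles,
                         uptime_pct, quality_yield_pct):
    """Break gross training data volume into losses and usable records.

    Assumption: uptime is applied to gross volume first, and quality
    yield is then applied to the captured records only.
    """
    # Documented formula: gross = records per cycle x planned cycles
    gross = records_per_cycle * planned_cycles
    # Records actually captured while the capture system was up
    captured = gross * uptime_pct / 100
    # Captured records that also pass the quality yield
    usable = captured * quality_yield_pct / 100
    return {
        "gross": gross,
        "uptime_loss": gross - captured,
        "quality_loss": captured - usable,
        "usable": usable,
    }


# Inputs from this scenario: 250 records/cycle, 120 cycles, 66 % uptime, 85 % yield
result = training_data_volume(250, 120, 66, 85)
for name, value in result.items():
    print(f"{name}: {value:,.0f} records")
```

With the scenario inputs, the four values correspond to the gross volume, uptime loss, quality yield loss, and usable volume listed in this example.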

How this compares with the baseline

  • Against the tool's baseline example, where data capture uptime sits at 92% and the headline result is 23,460 records, this scenario yields 16,830 records. That is 6,630 records fewer, or 28.26% below the baseline.
  • Use this comparison when planning a data-collection campaign or validating whether a pipeline will produce enough clean records to train. A shortfall of this size usually justifies fixing data capture uptime before touching anything else, because every figure in the table except gross volume is downstream of it.
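The baseline comparison can be reproduced with a short Python sketch. The helper name below is hypothetical, and the sketch assumes the baseline keeps every input at its documented default except uptime, as this example describes.

```python
def usable_records(records_per_cycle, planned_cycles,
                   uptime_pct, quality_yield_pct):
    """Usable volume = gross volume x uptime x quality yield (percent inputs)."""
    gross = records_per_cycle * planned_cycles
    return gross * uptime_pct * quality_yield_pct / 10_000


# Only uptime differs between the two runs; other inputs stay at the defaults
baseline = usable_records(250, 120, 92, 85)
scenario = usable_records(250, 120, 66, 85)

gap = baseline - scenario
shortfall_pct = gap / baseline * 100

print(f"baseline:  {baseline:,.0f} records")
print(f"scenario:  {scenario:,.0f} records")
print(f"shortfall: {gap:,.0f} records ({shortfall_pct:.2f}% below baseline)")
```

Because only uptime changes, the percentage shortfall equals 1 − 66/92, so it stays the same whatever the cycle count or quality yield is.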

Results at a glance

  • Usable training data volume: 16,830 records (headline result)
  • Gross training data volume: 30,000 records
  • Data capture uptime loss: 10,200 records
  • Data quality yield loss: 2,970 records

Run it with your numbers

  • To rerun this with your own numbers, open the live Training Data Volume calculator, set data capture uptime to your actual value, and adjust the remaining inputs to match your operation.

Last reviewed 2026-05-12.