
Custom Metrics

Track domain-specific measurements — throughput, accuracy, error rates, or anything your benchmark produces. Artemis already records runtime, CPU, and memory automatically; custom metrics let you measure what matters to your project, then compare those values across versions of your code.

How Artemis reads your metrics

Your benchmark writes a results file; Artemis picks it up after every run — no extra configuration.

  1. Write a results file — have your benchmark script write an artemis_results.json or artemis_results.csv to the project root before it exits.
  2. Run a benchmark — run a benchmark from the project's execution settings, or let an optimisation run it. Each run produces one results file.
  3. Artemis picks it up — after the run, Artemis reads the file from the project root and stores every value. If both a .json and a .csv exist, JSON takes priority.
  4. Compare across versions — recorded values show up in the project's metrics table and feed the distribution and trend charts, so you can see how each version moves the numbers.
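
As a rough illustration of step 1, a Python benchmark might collect its measurements and write them out just before it exits. This is a minimal sketch, not Artemis code: `write_results`, `measure_once`, and the metric values are hypothetical, and it assumes the benchmark is launched from the project root so that the current working directory is where Artemis looks.

```python
import json
from pathlib import Path


def write_results(measurements, root="."):
    """Write a list of {metric: number} dicts to artemis_results.json in `root`."""
    path = Path(root) / "artemis_results.json"
    path.write_text(json.dumps(measurements, indent=2))
    return path


def measure_once(i):
    # Hypothetical stand-in for one benchmark iteration.
    return {"throughput": 4500 + 10 * i, "error_rate": 0.02}


if __name__ == "__main__":
    # One object per measurement; several measurements build a distribution.
    results = [measure_once(i) for i in range(3)]
    write_results(results)
```

Writing the file as the last step keeps a partial file from being left behind if the benchmark fails midway.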

File format

Use any metric names you want. Each object (JSON) or row (CSV) is one measurement — run your benchmark multiple times to build a distribution.

artemis_results.json

[
  { "throughput": 4500, "error_rate": 0.02 },
  { "throughput": 4620, "error_rate": 0.01 },
  { "throughput": 4480, "error_rate": 0.03 }
]

artemis_results.csv

throughput,error_rate
4500,0.02
4620,0.01
4480,0.03

Numbers only

Every value must be a finite number. Keep metric names consistent across measurements so they line up into one series.
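
For the CSV variant, Python's standard `csv` module can produce the header row and one row per measurement. The helper name `write_results_csv` is hypothetical, and refusing rows whose metric names differ is a choice made here to keep the columns aligned, not something the text above says Artemis requires.

```python
import csv
from pathlib import Path


def write_results_csv(measurements, root="."):
    """Write measurements as artemis_results.csv: a header row, then one row each."""
    if not measurements:
        raise ValueError("need at least one measurement")
    names = list(measurements[0])
    for row in measurements:
        # Same metric names in every row, so each column is one series.
        if set(row) != set(names):
            raise ValueError(f"inconsistent metric names: {sorted(row)}")
    path = Path(root) / "artemis_results.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=names)
        writer.writeheader()
        writer.writerows(measurements)
    return path
```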

The rules Artemis enforces:

  • The file must be named exactly artemis_results.json or artemis_results.csv and live in the working directory (project root), not a subfolder.
  • Every metric value must be a finite number (integer or float). No strings, booleans, null, or nested objects.
  • Use clear, consistent metric names across runs (e.g. throughput, accuracy, error_rate, inference_ms).
  • One run → a single JSON object: {"throughput": 4500, "error_rate": 0.02}.
  • Multiple measurements → a JSON array of objects, or a CSV with a header row and one numeric row per measurement.
  • If both a .json and a .csv exist, Artemis uses the JSON.
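
Checking these rules locally before a run can catch bad values early. The sketch below is one possible pre-flight check, not how Artemis validates files internally. `check_measurements` and `is_finite_number` are hypothetical names, and flagging differing metric names as a problem is an assumption that goes slightly beyond the rules as written.

```python
import math


def is_finite_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and math.isfinite(value)


def check_measurements(measurements):
    """Return a list of problems; an empty list means the data follows the rules."""
    problems = []
    if isinstance(measurements, dict):
        measurements = [measurements]  # a single run is one object
    names = None
    for i, row in enumerate(measurements):
        if not isinstance(row, dict):
            problems.append(f"measurement {i}: not an object")
            continue
        if names is None:
            names = set(row)
        elif set(row) != names:
            problems.append(f"measurement {i}: metric names differ from earlier measurements")
        for key, value in row.items():
            if not is_finite_number(value):
                problems.append(f"measurement {i}: {key!r} is not a finite number")
    return problems
```

Calling `check_measurements` on the data just before it is written, and failing the benchmark if the list is non-empty, surfaces NaN, infinity, strings, booleans, and null values before Artemis reads the file.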

Set it up with your coding agent

Custom metrics need a small change to your benchmark code. Paste this prompt into your own coding agent (Claude Code, Cursor, Copilot…) inside your repo — it reads your benchmark and adds the artemis_results output for you.

I'm setting up **custom metrics** for TurinTech Artemis.

Artemis runs my benchmark on a runner and, after each run, reads a results file from the project root to track performance metrics across versions of my code.

Please update my benchmark so that, right before it exits, it writes the key metrics it measures to an `artemis_results.json` (or `artemis_results.csv`) file in the project root.

Rules Artemis enforces:
- The file must be named exactly `artemis_results.json` or `artemis_results.csv` and live in the working directory (project root), not a subfolder.
- Every metric value must be a finite number (integer or float). No strings, booleans, null, or nested objects.
- Use clear, consistent metric names across runs (e.g. throughput, accuracy, error_rate, inference_ms).
- One run -> a single JSON object: {"throughput": 4500, "error_rate": 0.02}
- Multiple measurements -> a JSON array of objects, or a CSV with a header row and one numeric row per measurement.
- If both a .json and a .csv exist, Artemis uses the JSON.

Steps:
1. Read my benchmark code and list the numeric metrics it already computes.
2. Pick the ones worth tracking and add code to write them to artemis_results.json before the benchmark exits.
3. If the benchmark runs in a separate directory, copy the results file back to the project root before exiting.
4. Run the benchmark locally and confirm the file appears in the project root with numeric values.
5. Show me the diff.

Prefer a guided, end-to-end setup?

Install the repo-setup Artemis skill and run it in your terminal — it walks through build, test, and benchmark configuration, including custom metrics. See Artemis Skills.

Running in a separate directory

If your benchmark runs somewhere other than the project root — a clone, a build dir, a container mount — copy the results file back before the script exits.

# Run your benchmark wherever it lives, then copy the
# results file back to the project root before exiting.
ORIG="$(pwd)"
# Stop if the directory is missing, so the benchmark never runs in the wrong place.
cd /path/to/clone || exit 1
./run_bench.sh
cp artemis_results.json "$ORIG/"

Artemis only looks at the project root after a run, so the file has to land there. Capturing $(pwd) first is the simplest way to copy it back from wherever your benchmark ran.
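
If the benchmark is driven from Python rather than a shell script, the same copy-back step might look like the sketch below. `run_and_copy_back` is a hypothetical helper, and checking for the JSON file before the CSV simply mirrors the priority described earlier.

```python
import shutil
import subprocess
from pathlib import Path


def run_and_copy_back(bench_dir, command, project_root=None):
    """Run `command` inside `bench_dir`, then copy the results file to the project root."""
    # Capture the project root before running, like ORIG=$(pwd) in the shell version.
    root = Path(project_root) if project_root else Path.cwd()
    subprocess.run(command, cwd=bench_dir, check=True)
    for name in ("artemis_results.json", "artemis_results.csv"):
        src = Path(bench_dir) / name
        if src.exists():
            return Path(shutil.copy(src, root / name))
    raise FileNotFoundError(f"no artemis_results file found in {bench_dir}")
```

A call such as `run_and_copy_back("/path/to/clone", ["./run_bench.sh"])` would follow the same flow as the shell version; `check=True` makes a failing benchmark raise an error, so a stale results file is not copied.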

What you get

Once values are recorded, each metric gets a distribution (box plot) and a trend over time (line chart), built from the same values that appear in the project's metrics table.

The box plot shows the spread (minimum, quartiles, median, and maximum), so you can spot noise and outliers at a glance. The line chart shows each measurement in order, so a drift or regression across runs is easy to see.
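
To preview what a box plot summarizes, the quartile figures can be computed locally from a metric's values. This is only a sketch: `five_number_summary` is a hypothetical helper, and the text does not say which quartile method Artemis uses, so Python's "inclusive" method is an assumption and the charted values may differ slightly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    # quantiles() with n=4 returns the three cut points between quarters.
    # On Python versions before 3.13 it needs at least two values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


if __name__ == "__main__":
    throughput = [4500, 4620, 4480]  # values from the example file above
    print(five_number_summary(throughput))
```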