fix(ci): migrate benchmarks to benchmarking-platform trigger #607

Open
jbachorik wants to merge 29 commits into main from jb/bench-bp-trigger

@jbachorik

Collaborator

What does this PR do?:
Replaces the broken inline benchmark jobs with a bridge job that triggers the benchmarking-platform (BP) pipeline. The old jobs ran the profiler inline via shell scripts and uploaded results to S3 themselves; the new approach delegates entirely to the BP pipeline at DataDog/apm-reliability/benchmarking-platform@java-profiler.

Motivation:
The existing benchmark CI was broken and unmaintained. The benchmarking-platform provides a reliable, standardised infra for running and tracking reliability benchmarks.

Additional Notes:

  • benchmarks-trigger is a GitLab bridge job — it cannot appear in needs: of downstream jobs, so post-benchmarks-pr-comment and publish-benchmark-gh-pages now run in a post-benchmarks stage ordered after benchmarks.
  • download-s3-reports.sh fetches reports uploaded by BP under s3://relenv-benchmarking-data/java-profiler/${CI_PIPELINE_ID}/.
  • The images.yml include is removed — the BP side manages its own Docker image build.
  • jb/bench-memory-limit-fix is superseded by this PR (the memory-heavy aarch64 jobs are gone).
  • The BP side implementation is in DataDog/benchmarking-platform PR Run tests on musl aarch64 #190.
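
The report location in the list above follows a fixed pattern keyed by pipeline ID. A minimal Python sketch of how such a prefix and a matching `aws s3 sync` invocation could be assembled. The real download-s3-reports.sh is a shell script whose contents are not shown in this thread, and `report_prefix`, `sync_command`, and the `reports` destination directory are hypothetical names:

```python
import shlex

BUCKET = "relenv-benchmarking-data"  # bucket named in the PR description
PROJECT = "java-profiler"            # per-project key prefix from the PR description


def report_prefix(pipeline_id: str) -> str:
    """Return the S3 prefix under which BP uploads reports for one pipeline."""
    if not pipeline_id.isdigit():
        # Assumption: GitLab pipeline IDs are plain integers.
        raise ValueError(f"unexpected pipeline id: {pipeline_id!r}")
    return f"s3://{BUCKET}/{PROJECT}/{pipeline_id}/"


def sync_command(pipeline_id: str, dest_dir: str = "reports") -> list:
    """Build (but do not run) an `aws s3 sync` command that mirrors the prefix locally."""
    return ["aws", "s3", "sync", report_prefix(pipeline_id), dest_dir]


# Hypothetical pipeline ID, standing in for ${CI_PIPELINE_ID}.
print(shlex.join(sync_command("12345")))
# prints: aws s3 sync s3://relenv-benchmarking-data/java-profiler/12345/ reports
```

Building the command as a list rather than a string keeps the pipeline ID from being re-interpreted by a shell if the list is later passed to a process launcher.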

How to test the change?:
Push to a branch and verify the GitLab pipeline shows a benchmarks-trigger bridge job that fires the BP downstream pipeline. The post-benchmarks-pr-comment and publish-benchmark-gh-pages jobs should run in the post-benchmarks stage after the trigger completes.

For Datadog employees:

  • This PR doesn't touch any of that.
  • JIRA: [JIRA-XXXX]

jbachorik added the AI label Jun 18, 2026
dd-octo-sts (bot, Contributor) commented Jun 18, 2026

CI Test Results

Run: #27865993044 | Commit: 4db745c | Duration: 14m 5s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 32 | Failed: 0


Updated: 2026-06-20 09:00:35 UTC

dd-octo-sts (bot, Contributor) commented Jun 19, 2026

Benchmark Results

Pipeline: https://gitlab.ddbuild.io/DataDog/apm-reliability/benchmarking-platform/-/pipelines/120034731
Commit: 1b03a1cdd1c14c67b26831dfcc41e7e1aa652ea6

✅ Within expected boundaries

No significant runtime deltas (all within run-to-run noise) and no internal-counter outliers.

Runtime details (per benchmark × JDK)

| Benchmark | JDK | Latest | Dev | Δ (dev vs latest) | Issues L/D |
|---|---|---|---|---|---|
| akka-uct | 21 | ✅ 10326 ms (7 iters) | ✅ 10196 ms (7 iters) | ≈ -1.3% (±23.3%) | — / — |
| akka-uct | 25 | ✅ 8666 ms (8 iters) | ✅ 8878 ms (8 iters) | ≈ +2.4% (±16.4%) | — / — |
| finagle-chirper | 21 | ✅ 6045 ms (11 iters) | ✅ 6106 ms (11 iters) | ≈ +1% (±46.1%) | ⚠️ W:1 / ⚠️ W:1 |
| finagle-chirper | 25 | ✅ 5426 ms (12 iters) | ✅ 5531 ms (12 iters) | ≈ +1.9% (±43.1%) | ⚠️ W:1 / ⚠️ W:1 |
| fj-kmeans | 21 | ✅ 2861 ms (22 iters) | ✅ 2836 ms (22 iters) | ≈ -0.9% (±4.3%) | — / — |
| fj-kmeans | 25 | ✅ 2825 ms (22 iters) | ✅ 2820 ms (22 iters) | ≈ -0.2% (±4.5%) | — / — |
| future-genetic | 21 | ✅ 2128 ms (29 iters) | ✅ 2162 ms (29 iters) | ≈ +1.6% (±4.6%) | — / — |
| future-genetic | 25 | ✅ 1967 ms (31 iters) | ✅ 2045 ms (30 iters) | ≈ +4% (±4.6%) | — / — |
| naive-bayes | 21 | ✅ 1285 ms (44 iters) | ✅ 1237 ms (46 iters) | ≈ -3.7% (±56%) | — / — |
| naive-bayes | 25 | ✅ 1000 ms (57 iters) | ✅ 1039 ms (55 iters) | ≈ +3.9% (±56.4%) | — / — |
| reactors | 21 | ✅ 17142 ms (5 iters) | ✅ 16038 ms (5 iters) | ≈ -6.4% (±17.7%) | — / — |
| reactors | 25 | ✅ 18732 ms (5 iters) | ✅ 17925 ms (5 iters) | ≈ -4.3% (±4.9%) | — / — |

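
The Δ column above can be reproduced from the Latest and Dev timings. A minimal Python sketch, assuming the delta is the relative change of Dev against Latest and that "within run-to-run noise" means the absolute delta does not exceed the ± band. The bot's actual significance rule is not shown in this thread, and both function names are hypothetical:

```python
def relative_delta_pct(latest_ms: float, dev_ms: float) -> float:
    """Percentage change of dev relative to latest (negative means dev is faster)."""
    return (dev_ms - latest_ms) / latest_ms * 100.0


def within_noise(delta_pct: float, noise_pct: float) -> bool:
    """Assumed rule: a delta is insignificant when it fits inside the ± noise band."""
    return abs(delta_pct) <= noise_pct


# Values copied from the reactors / JDK 21 row of the runtime table.
delta = relative_delta_pct(latest_ms=17142, dev_ms=16038)
print(f"{delta:+.1f}%", within_noise(delta, noise_pct=17.7))
# prints: -6.4% True
```

Under this assumed rule, every row in the runtime table falls inside its own noise band, which is consistent with the "Within expected boundaries" verdict.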
Internal counter details (ddprof)

ddprof internal counters, latest / dev (✅ = 0, · = unavailable):

| Benchmark | JDK | Dropped rec | Dropped jvmti | Dropped trace | Skipped WC | AGCT fail | Unwind fail |
|---|---|---|---|---|---|---|---|
| akka-uct | 21 | · / · | · / · | · / · | · / · | · / · | ✅ / · |
| akka-uct | 25 | ✅ / ✅ | ✅ / ✅ | 2 / 1 | 2207 / 2313 | ✅ / ✅ | ✅ / ✅ |
| finagle-chirper | 21 | · / · | · / · | · / · | · / · | · / · | ✅ / · |
| finagle-chirper | 25 | ✅ / ✅ | ✅ / ✅ | ✅ / 3 | 8276 / 8507 | ✅ / ✅ | ✅ / ✅ |
| fj-kmeans | 21 | · / · | · / · | · / · | · / · | · / · | ✅ / · |
| fj-kmeans | 25 | ✅ / · | ✅ / · | 2 / · | 1274 / · | ✅ / · | ✅ / ✅ |
| future-genetic | 21 | · / · | · / · | · / · | · / · | · / · | · / · |
| future-genetic | 25 | ✅ / ✅ | ✅ / ✅ | 2 / 2 | 2872 / 2882 | ✅ / ✅ | ✅ / ✅ |
| naive-bayes | 21 | · / · | · / · | · / · | · / · | · / · | ✅ / · |
| naive-bayes | 25 | ✅ / ✅ | ✅ / ✅ | 2 / 2 | 3471 / 3508 | ✅ / ✅ | ✅ / ✅ |
| reactors | 21 | · / · | · / · | · / · | · / · | · / · | · / ✅ |
| reactors | 25 | ✅ / ✅ | ✅ / ✅ | 1 / ✅ | 1924 / 1840 | ✅ / ✅ | ✅ / ✅ |

jbachorik marked this pull request as ready for review June 20, 2026 08:51
jbachorik requested a review from a team as a code owner June 20, 2026 08:51

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc5c810c9d

Comment thread .gitlab/benchmarks/.gitlab-ci.yml Outdated
Comment thread .gitlab/benchmarks/.gitlab-ci.yml