Powered by Qualcomm AI Hub

Your edge AI passed in the cloud. Can you prove it on the device?

Quantize a model for the NPU and it can silently lose accuracy, fall back to CPU, or burn 4× the battery. When a safety assessor asks what shipped, a green checkmark isn't evidence. EdgeGate gates every model on real Snapdragon hardware in CI and signs an ISO 26262-traceable evidence bundle for every result.

Your model never leaves your AWS account.

Free early access. No credit card required.

50+ teams on waitlist
Recognized by NVIDIA Inception Program
See It In Action

From model upload to hardware-validated CI

Watch how EdgeGate catches on-device regressions before they reach production — in under 60 seconds.

The Problem

Edge AI breaks in ways you can't test for — until now

Emulators and cloud GPUs can't replicate what happens on a real Snapdragon device in the field.

Cloud tests lie

Your model scores 95% accuracy on cloud GPUs. On a Snapdragon chipset running at 40°C? It drops to 71%. Cloud benchmarks don’t predict on-device behavior.

Hardware is unpredictable

Thermal throttling, firmware updates, and power states change how your model runs. These variables don’t exist in simulation — only on real silicon.

Regressions ship silently

A weight update that looks fine in your training pipeline can quietly triple on-device latency. Without hardware-in-the-loop CI, you won’t know until users complain.

0.18ms
Measured Inference on Snapdragon 8 Gen 3
2/2
Gates Passed (FP32 & INT8)
121 MB
Peak Memory (Under 150 MB Gate)
100%
Signed Evidence Bundles
How It Works

From git push to hardware-validated in minutes

Add hardware regression testing to your existing CI/CD pipeline. No infrastructure to manage.

01

Push your model

Upload your ONNX model and create a pipeline in the dashboard. Set pass/fail gates for inference time and peak memory on your target device.

# Pipeline config (via Dashboard or API)
model: resnet18_fp32.onnx
format: onnx  # embedded weights
gates:
  inference_time_ms: "<=1.0"
  peak_memory_mb: "<=150"
device: sm8650  # Samsung Galaxy S24
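As a rough illustration of how gate expressions like the ones above could be evaluated, here is a minimal Python sketch. The function names (`check_gate`, `evaluate_gates`) and the accepted operator set are assumptions made for this example, not EdgeGate's actual implementation.

```python
import operator
import re

# Comparison operators accepted in a gate expression such as "<=1.0".
# The operator set is an assumption for this sketch.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}
_GATE_RE = re.compile(r"^\s*(<=|>=|==|<|>)\s*([0-9]+(?:\.[0-9]+)?)\s*$")


def check_gate(expression: str, measured: float) -> bool:
    """Return True when the measured value satisfies the gate expression."""
    match = _GATE_RE.match(expression)
    if match is None:
        raise ValueError(f"unsupported gate expression: {expression!r}")
    op, threshold = match.groups()
    return _OPS[op](measured, float(threshold))


def evaluate_gates(gates: dict[str, str], metrics: dict[str, float]) -> dict[str, bool]:
    """Evaluate every configured gate; a metric that was not measured fails."""
    return {
        name: name in metrics and check_gate(expr, metrics[name])
        for name, expr in gates.items()
    }


gates = {"inference_time_ms": "<=1.0", "peak_memory_mb": "<=150"}
metrics = {"inference_time_ms": 0.176, "peak_memory_mb": 121.51}
results = evaluate_gates(gates, metrics)
print(results, "PASS" if all(results.values()) else "FAIL")
```

Treating a missing metric as a failure keeps a run from passing just because a measurement was never collected.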
02

Test on real hardware

EdgeGate runs your model on physical Snapdragon devices through Qualcomm AI Hub. No emulators. Each result is the median of N runs with warmup runs excluded, so numbers stay repeatable from one CI run to the next.

[EdgeGate] Device: Samsung Galaxy S24 (SM8650)
[EdgeGate] Compiling via Qualcomm AI Hub...
[EdgeGate] Profiling on-device (median-of-N)
[EdgeGate] Inference: 0.176ms  ✓ (gate: ≤1.0ms)
[EdgeGate] Peak memory: 121.51 MB  ✓ (gate: ≤150 MB)
[EdgeGate] Model size: 1.07 MB (270,146 params)
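The median-of-N measurement with warmup exclusion can be sketched in a few lines of Python. The warmup count of 3 and the sample values are illustrative assumptions; the page does not state which N or warmup length EdgeGate uses.

```python
import statistics


def median_latency_ms(samples_ms: list[float], warmup: int = 3) -> float:
    """Median of the samples that remain after dropping the first `warmup` runs."""
    steady = samples_ms[warmup:]
    if not steady:
        raise ValueError("need more samples than warmup runs")
    return statistics.median(steady)


# Made-up numbers: the first runs are slow while caches warm up,
# and the last run is an outlier that the median ignores.
samples = [0.90, 0.41, 0.25, 0.18, 0.17, 0.19, 0.17, 0.60]
print(median_latency_ms(samples))
```

The median is used rather than the mean so that a single slow run, such as the final 0.60 ms sample, does not move the reported value.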
03

Gate your PR

Results flow back to your CI pipeline as a pass/fail gate. Failed gates block the merge. Every run produces a signed evidence bundle with SHA-256 hashes for auditability.

✓ 2/2 GATES PASSED — PR #247 can merge

Evidence bundle: dc2e9f67
  model_hash: sha256:4f8a2c...
  signed: Ed25519 (workspace key)
  device: SM8650 (Samsung Galaxy S24)
  inference: 0.176ms | memory: 121.51 MB
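To show what SHA-256 hashing contributes to auditability, the following Python sketch builds and re-verifies a small manifest. The field layout and function names are hypothetical, and the Ed25519 signature is left out because Python's standard library does not provide it; in a real bundle the signature would presumably cover a digest like the one computed here.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators make the digest reproducible.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_manifest(model_bytes: bytes, device: str, metrics: dict[str, float]) -> dict:
    """Assemble a manifest (hypothetical layout) plus a digest over its body."""
    body = {
        "model_hash": "sha256:" + sha256_hex(model_bytes),
        "device": device,
        "metrics": metrics,
    }
    return {"body": body, "digest": "sha256:" + sha256_hex(_canonical(body))}


def verify_manifest(manifest: dict, model_bytes: bytes) -> bool:
    """Recompute both hashes and compare them with the recorded values."""
    body = manifest["body"]
    return (
        body["model_hash"] == "sha256:" + sha256_hex(model_bytes)
        and manifest["digest"] == "sha256:" + sha256_hex(_canonical(body))
    )


model = b"placeholder model bytes"
manifest = build_manifest(model, "sm8650", {"inference_time_ms": 0.176})
print(verify_manifest(manifest, model))
```

Changing either the model file or any recorded metric alters a hash, so an auditor who recomputes them can detect the edit.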
NEW · v0.1.0 · MCP Server

Set up edge-AI CI gates by talking to Claude.

EdgeGate's MCP server turns 10 minutes of dashboard clicks into one prompt. Works with Claude Code, Cursor, and Claude Desktop.

claude-code · project/mobile-sdk · One prompt

You

Use the edgegate MCP to set up a CI gate for my MobileNet ONNX model. Gates: inference_time_ms ≤ 10, peak_memory_mb ≤ 150. Devices: Galaxy S24, Galaxy S23.

edgegate_setup_workspace
{ "name": "Mobile Team" }

✓ workspace_id: ws_9f2a1c

edgegate_upload_model
{ "path": "./mobilenet_v2.onnx" }

✓ model_id: mdl_3e8b07

edgegate_create_pipeline
{ "name": "MobileNet Gates", "devices": ["sm8650","sm8550"],
  "gates": { "inference_time_ms": "<=10", "peak_memory_mb": "<=150" } }

✓ pipeline_id: pip_7c4d22

edgegate_run_gate
{ "pipeline_id": "pip_7c4d22" }

✓ run_id: jgomvedq5 (in flight)

edgegate_setup_github_action
{ "workspace_id": "ws_9f2a1c" }

✓ .github/workflows/edgegate.yml ready to commit

Claude

Done. Workspace Mobile Team, pipeline MobileNet Gates created, run jgomvedq5 in flight, and .github/workflows/edgegate.yml is ready to commit. Want me to push it?

Total elapsed: ~30 seconds
edgegate.ai · browser · Without MCP

What you would have done manually

  1. Log into EdgeGate dashboard
  2. Navigate to Settings → AI Hub Integration
  3. Paste Qualcomm AI Hub API token
  4. Create new workspace for your team
  5. Click Upload Model → pick your ONNX file
  6. Wait for upload (varies by model size)
  7. Create pipeline → set name and description
  8. Add devices: Galaxy S24, Galaxy S23
  9. Set gates: inference_time_ms ≤ 10, peak_memory_mb ≤ 150
  10. Settings → CI/CD → Generate HMAC key
  11. Copy workflow YAML → add to .github/workflows/
  12. Commit and push to verify CI wires up
Typical wall clock: ≈ 10 minutes

30 seconds, not 10 minutes

Total wall clock from prompt to wired CI. Includes workspace creation, model upload, pipeline config, and the gh commands.

Stays in your editor

Never context-switch to the browser. The MCP server is a real MCP tool running locally — not a chat widget or browser extension.

Same signed evidence bundles

MCP calls hit the same API as the dashboard. Audit trails, Ed25519 signatures, and SHA-256 manifests are identical.

Get started in 2 steps

terminal
# 1. Generate an API key in Settings → API Keys
# 2. Install in Claude Code (MCP + 6 slash commands):
/plugin install https://github.com/frozo-ai/edgegate-mcp

# Cursor / Claude Desktop? Use the interactive installer:
npx edgegate-mcp-install
# That's it. Restart your MCP client and start prompting.
Features

Everything you need to ship edge AI with confidence

Purpose-built for teams deploying AI models to Snapdragon-powered devices.

Quantized models silently fall back to CPU — 4× the battery, half the speed

Silent NPU Fallback Detection

EdgeGate measures npu_compute_percent on every run and blocks merges when execution drifts off the NPU. Catches the failure mode that latency-only gates miss — inference passes, but the model is now running on a path that drains battery in hours, not days.
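One way to picture an NPU-residency gate is sketched below in Python. How EdgeGate actually derives npu_compute_percent is not described here; this example assumes a hypothetical per-layer list of compute-unit labels and defines the percentage as the share of layers placed on the NPU. The 95% threshold is likewise an illustrative choice.

```python
from collections import Counter


def npu_compute_percent(layer_units: list[str]) -> float:
    """Share of profiled layers assigned to the NPU, as a percentage."""
    if not layer_units:
        raise ValueError("no layers profiled")
    counts = Counter(unit.upper() for unit in layer_units)
    return 100.0 * counts["NPU"] / len(layer_units)


def npu_gate(layer_units: list[str], min_percent: float = 95.0) -> bool:
    """Pass only when enough of the model stays on the NPU."""
    return npu_compute_percent(layer_units) >= min_percent


# Hypothetical profile: two of ten layers fell back to the CPU.
profile = ["NPU"] * 8 + ["CPU"] * 2
print(npu_compute_percent(profile), npu_gate(profile))
```

A latency-only gate could still pass for this profile, which is why residency is checked as a separate condition.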

Testing model variants across devices is a serial, slow process

Multi-Model × Multi-Device Matrix

Compare FP32, INT8, pruned, distilled — across Snapdragon 8 Gen 3, X Elite, and your full target fleet — in one run. Up to 25 cells, executed in parallel via Celery chord fan-out. One gate blocks the merge if any cell fails.
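The matrix behaviour (every model on every device, a 25-cell cap, one failing cell blocks the merge) can be sketched as follows. This Python example uses a thread pool as a stand-in for the Celery chord fan-out mentioned above, and `run_cell` is a hypothetical callback that returns whether one model passed on one device.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable

MAX_CELLS = 25  # cell limit stated in the feature description


def run_matrix(
    models: list[str],
    devices: list[str],
    run_cell: Callable[[str, str], bool],
) -> tuple[dict[tuple[str, str], bool], bool]:
    """Run every (model, device) cell in parallel; pass only if all cells pass."""
    cells = list(product(models, devices))
    if not cells:
        raise ValueError("matrix is empty")
    if len(cells) > MAX_CELLS:
        raise ValueError(f"{len(cells)} cells exceeds the limit of {MAX_CELLS}")
    with ThreadPoolExecutor(max_workers=len(cells)) as pool:
        outcomes = list(pool.map(lambda cell: run_cell(*cell), cells))
    results = dict(zip(cells, outcomes))
    return results, all(outcomes)


def fake_run(model: str, device: str) -> bool:
    # Stand-in for a real on-device run: pretend INT8 fails on one device.
    return not (model == "int8" and device == "sm8550")


results, merge_allowed = run_matrix(["fp32", "int8"], ["sm8650", "sm8550"], fake_run)
print(results, merge_allowed)
```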

You can see the new run’s metrics, but not what changed since main

Run-Over-Run Signed Diffs

Every run computes a structured diff against its baseline: metric deltas, gate flips, per-device breakdown. The diff is folded into the signed evidence bundle, so the commit→commit comparison is cryptographically anchored — not just a UI rendering.
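A structured diff of this kind might look like the Python sketch below, which reports metric deltas and gate flips between a baseline run and the current run. The run layout (`metrics` and `gates` dictionaries) is an assumption for illustration, and the per-device breakdown is omitted for brevity.

```python
def diff_runs(baseline: dict, current: dict) -> dict:
    """Compare two runs shaped like {"metrics": {...}, "gates": {...}}."""
    metric_deltas = {
        name: round(value - baseline["metrics"][name], 6)
        for name, value in current["metrics"].items()
        if name in baseline["metrics"]
    }
    # A gate flip is recorded as (baseline_result, current_result).
    gate_flips = {
        name: (baseline["gates"][name], passed)
        for name, passed in current["gates"].items()
        if name in baseline["gates"] and baseline["gates"][name] != passed
    }
    return {"metric_deltas": metric_deltas, "gate_flips": gate_flips}


baseline = {
    "metrics": {"inference_time_ms": 0.176, "peak_memory_mb": 121.51},
    "gates": {"inference_time_ms": True, "peak_memory_mb": True},
}
current = {
    "metrics": {"inference_time_ms": 1.2, "peak_memory_mb": 121.51},
    "gates": {"inference_time_ms": False, "peak_memory_mb": True},
}
print(diff_runs(baseline, current))
```

Because the diff is plain data, it can be serialized and hashed together with the rest of the evidence bundle.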

Vision gates don’t catch LLM regressions — latency means nothing without throughput

LLM Gating (TTFT · TPS · NPU%)

First-class gates for time-to-first-token, tokens/sec, and NPU residency on small LLMs (TinyLlama, Phi-3-mini, Llama 3.2 1B/3B). Catch the case where a quantization tweak halves tokens/sec while leaving time-to-first-token unchanged.
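For readers unfamiliar with the two LLM metrics, here is a minimal Python sketch of how time-to-first-token (TTFT) and tokens per second (TPS) can be derived from token arrival timestamps. The function name and the convention of measuring TPS over the decode phase only are assumptions; EdgeGate's exact definitions are not given here.

```python
def llm_metrics(request_start_s: float, token_times_s: list[float]) -> dict[str, float]:
    """Derive TTFT and decode throughput from token arrival times (seconds)."""
    if len(token_times_s) < 2:
        raise ValueError("need at least two tokens to measure throughput")
    ttft_ms = (token_times_s[0] - request_start_s) * 1000.0
    # Throughput is measured after the first token, so prompt processing
    # time affects TTFT but not tokens/sec.
    decode_s = token_times_s[-1] - token_times_s[0]
    tokens_per_sec = (len(token_times_s) - 1) / decode_s
    return {"ttft_ms": ttft_ms, "tokens_per_sec": tokens_per_sec}


# Illustrative timestamps: first token after 250 ms, then one every 250 ms.
print(llm_metrics(0.0, [0.25, 0.5, 0.75, 1.0, 1.25]))
```

Keeping the two numbers separate is what lets a gate fail on throughput even when the first token still arrives on time.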

Downstream systems trust your CI status, but can’t verify it themselves

Signed Bundle Registry

Every run produces an Ed25519-signed evidence bundle exposed via a public registry. Fleet management, MLOps, and compliance tools fetch + verify bundles directly. Two-sided audit log makes tampering detectable from either end.

Hardware testing is a manual, out-of-band process

CI/CD Native

Drop EdgeGate into GitHub Actions or GitLab CI. One workflow step. Results appear as PR checks. HMAC-signed CI requests with replay protection. Workspace concurrency = 1, so parallel CI jobs serialize cleanly without stomping on devices.

For your security team

Three guarantees, by design.

Run gates on your proprietary models without handing them over.

Read-only IAM

s3:GetObject only, which also covers HeadObject requests. Never s3:PutObject or s3:Delete*.

External ID required

Workspace-scoped UUID stops confused-deputy attacks at STS.

1:1 CloudTrail correlation

Every aws_request_id in our audit appears in your CloudTrail.
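The first two guarantees map onto standard AWS IAM constructs, sketched below as Python functions that build the policy documents. The bucket name, principal ARN, and external ID are placeholders, not EdgeGate values. In IAM, HEAD requests on an object are authorized by s3:GetObject, so the read-only policy lists only that action.

```python
import json


def read_only_policy(bucket: str) -> dict:
    """Permissions policy: read objects in one bucket, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def trust_policy(principal_arn: str, external_id: str) -> dict:
    """Trust policy: the role can be assumed only with the matching external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values for illustration only.
print(json.dumps(read_only_policy("example-model-bucket"), indent=2))
print(json.dumps(
    trust_policy("arn:aws:iam::111122223333:root",
                 "00000000-0000-0000-0000-000000000000"),
    indent=2,
))
```

The `sts:ExternalId` condition is what addresses the confused-deputy problem: a caller who knows the role ARN but not the workspace's external ID cannot assume the role.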

edgegate_register_byo_bucket

30 min from contract signed → first gated run

Contact Sales

Unprecedented Performance Insights

Visualize your model's performance delta across different firmware versions, temperatures, and battery states. Our dashboard provides a granular view that emulators simply cannot match.

  • Inference Latency (Median-of-N Gating)
  • Peak Memory vs. Gate Threshold
  • FP32 vs INT8 Model Comparison
edgegate · sm8650 (Galaxy S24) · Optimized
Model_V1 · Model_V2 · Current · Dev_3 · Dev_4 · Dev_5
FP32 inference: 0.176ms
INT8 inference: 0.187ms
Model size (INT8): -70%
Built For

For teams shipping AI to real devices

Whether you're building robots, drones, smart cameras, or mobile AI features — if it runs on Snapdragon, EdgeGate is your regression safety net.

ML Engineers

You train and optimize models for edge deployment. EdgeGate lets you validate that your INT8 quantization actually works on target hardware before merging.

  • Model quantization validation
  • Accuracy regression checks
  • Cross-device compatibility testing

Embedded / IoT Engineers

You build firmware and applications for Snapdragon-powered devices. EdgeGate catches latency regressions and thermal issues your desktop benchmarks miss.

  • Latency gate enforcement
  • Thermal throttling detection
  • Firmware update impact testing

DevOps / ML Platform Teams

You own the CI/CD pipeline. EdgeGate plugs into GitHub Actions or GitLab CI with one YAML file and gives you deterministic hardware gates.

  • CI/CD integration in minutes
  • HMAC-signed webhook triggers
  • Signed evidence for audit trails

Industries using EdgeGate

Robotics
Automotive
Mobile
IoT / Edge
Smart Cameras
Healthcare Devices

Stop shipping blind to hardware

Join the waitlist for early access. Be the first to add hardware regression gates to your CI pipeline.

Free early access. No credit card required.

50+ teams on waitlist