TL;DR
Qualcomm AI Hub provides API access to real Snapdragon devices for compiling, optimizing, and profiling AI models. EdgeGate wraps AI Hub's API with regression testing, quality gates, and signed evidence bundles. Setup takes four steps: get your AI Hub API token, connect it to EdgeGate, create a pipeline with target devices and gates, then wire it into GitHub Actions. Total time: about 15 minutes.
What Is Qualcomm AI Hub and Why Does It Matter for Edge AI?
Qualcomm AI Hub is a cloud service that gives developers API access to real Snapdragon hardware for model compilation, optimization, and profiling. Instead of buying physical devices and setting up a test lab, you submit your model through the API and get back real-device performance data.
AI Hub supports the full Snapdragon portfolio — from mobile chipsets like the Snapdragon 8 Gen 3 to automotive and IoT platforms. It handles model compilation for the Hexagon NPU, quantization, and operator-level profiling.
The limitation is that AI Hub gives you raw infrastructure. It compiles your model and runs it on a device, but it doesn't provide regression testing, quality gates, evidence bundles, or CI/CD integration. That's the gap EdgeGate fills.
How Does EdgeGate Work with Qualcomm AI Hub?
EdgeGate sits on top of AI Hub's API and adds a testing layer. When you submit a model to EdgeGate, here's what happens:
- Model submission: You upload an ONNX model (with embedded weights) to EdgeGate, either through the dashboard or the CI API.
- Compilation: EdgeGate calls AI Hub's compile API to build a device-optimized binary for your target Snapdragon chipset.
- Profiling: The compiled model runs on a real Snapdragon device through AI Hub's profile API. EdgeGate collects inference latency, peak memory, and model size metrics.
- Gate evaluation: Results are compared against your quality gates. Each metric gets a PASS or FAIL verdict.
- Evidence generation: EdgeGate produces an Ed25519-signed evidence bundle containing the model SHA-256 hash, device attestation, raw performance data, gate verdicts, and a tamper-proof signature.
- CI reporting: If triggered from GitHub Actions, results appear as a PR check with a summary of gate verdicts and a link to the full evidence bundle.
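The gate-evaluation step above is simple to picture in code. Here is a minimal sketch, assuming illustrative metric names and upper-bound thresholds — EdgeGate's actual gate schema may differ:

```python
# Minimal sketch of gate evaluation: each gate is an upper bound, and a
# metric passes when it is at or below its threshold. Metric names and
# units here are illustrative assumptions.
def evaluate_gates(metrics, gates):
    """Return a PASS/FAIL verdict for each gated metric."""
    verdicts = {}
    for name, limit in gates.items():
        verdicts[name] = "PASS" if metrics[name] <= limit else "FAIL"
    return verdicts

gates = {"latency_ms": 50, "peak_memory_mb": 500, "model_size_mb": 10}
metrics = {"latency_ms": 42.7, "peak_memory_mb": 310, "model_size_mb": 8.4}
print(evaluate_gates(metrics, gates))
# All three metrics are under their limits, so every verdict is PASS.
```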
How Do You Get a Qualcomm AI Hub API Token?
Before connecting EdgeGate, you need a Qualcomm AI Hub account and API token.
- Create an account at aihub.qualcomm.com. You'll need to agree to Qualcomm's developer terms.
- Navigate to API settings in your AI Hub dashboard.
- Generate an API token. This is a long-lived token that authenticates your requests. Store it securely — you'll add it to EdgeGate in the next step.
- Verify access by checking your available device targets. Free-tier AI Hub accounts have access to a subset of Snapdragon devices. Paid accounts unlock the full portfolio.
Your AI Hub token determines which devices you can test on and how many concurrent compilation/profiling jobs you can run. EdgeGate respects these limits and queues jobs if your AI Hub quota is full.
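The queue-when-full behavior can be pictured as a toy scheduler. The `JobQueue` class and the concurrency limit of 2 below are illustrative assumptions, not EdgeGate internals:

```python
# Toy sketch of quota-aware job scheduling: start jobs while the AI Hub
# concurrency quota has free slots, queue the rest, and promote queued
# jobs as running ones finish.
from collections import deque

class JobQueue:
    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent  # AI Hub concurrency quota
        self.running = []
        self.pending = deque()

    def submit(self, job):
        # Start immediately if quota allows, otherwise queue the job.
        if len(self.running) < self.max_concurrent:
            self.running.append(job)
            return "running"
        self.pending.append(job)
        return "queued"

    def complete(self, job):
        # Free a slot and promote the next pending job, if any.
        self.running.remove(job)
        if self.pending:
            self.running.append(self.pending.popleft())

q = JobQueue(max_concurrent=2)
print(q.submit("compile-a"))   # running
print(q.submit("profile-b"))   # running
print(q.submit("compile-c"))   # queued: quota of 2 is full
q.complete("compile-a")
print(q.running)               # compile-c was promoted from the queue
```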
How Do You Connect AI Hub to EdgeGate?
Once you have your AI Hub API token, connecting it to EdgeGate takes about 2 minutes:
- Log in to EdgeGate and navigate to Settings → Integrations.
- Paste your AI Hub API token in the Qualcomm AI Hub section. EdgeGate encrypts it at rest and never exposes it in logs or evidence bundles.
- Select your target devices. EdgeGate queries AI Hub to show which devices your token can access. Pick the devices your models deploy to — typically Snapdragon 8 Gen 3 for mobile or SA8295P for automotive.
- Run a connectivity test. EdgeGate submits a small probe model to verify the token works and the selected devices are available. This takes about 30 seconds.
# Alternatively, connect via the EdgeGate API:
curl -X POST "https://edgegateapi.frozo.ai/v1/workspaces/{ws_id}/integrations/qaihub" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"token": "your-ai-hub-api-token"}'
# Then run a connectivity test:
# Dashboard → Settings → Integrations → Run ProbeSuite
# Output:
# ✓ AI Hub token valid
# ✓ Device sm8650 available (Samsung Galaxy S24 Family)
# ✓ Probe model profiled successfully

How Do You Create Your First Pipeline?
A pipeline in EdgeGate defines what you test, where you test it, and what "good" looks like. Here's how to set one up:
- Navigate to Dashboard → New Pipeline
- Name your pipeline (e.g., "person-detection-prod")
- Select target devices. You can test on a single device or a matrix of devices. For most teams, start with your primary deployment target.
- Define quality gates:
| Gate | Recommended Default | Adjust Based On |
|---|---|---|
| Inference latency | ≤ 50 ms | Your application's frame rate or response time SLA |
| Peak memory | ≤ 500 MB | Your device's available memory budget for the model |
| Model size | ≤ 10 MB | Deployment size constraints for your target device |
- Configure statistical settings: Set warmup iterations (default: 3) and benchmark iterations (default: 10). EdgeGate uses median-of-N gating to handle the natural variance of on-device measurement.
- Save and run a test. Upload an ONNX model or click "Run ProbeSuite" to test with EdgeGate's built-in probe model. Your first results will be ready in about 2 minutes.
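The median-of-N gating mentioned above can be sketched as follows, assuming latency samples in milliseconds with the default 3 warmup and 10 benchmark iterations:

```python
# Sketch of median-of-N gating: discard the warmup iterations (slow,
# cold caches), then gate on the median of the benchmark iterations so
# a single noisy sample can't flip the verdict.
from statistics import median

def gate_latency(samples_ms, warmup=3, limit_ms=50.0):
    """Drop warmup runs, then gate on the median of the rest."""
    benchmark = samples_ms[warmup:]
    med = median(benchmark)
    return med, "PASS" if med <= limit_ms else "FAIL"

# 3 warmup runs + 10 benchmark runs, all in milliseconds.
samples = [88.1, 72.4, 60.3,
           48.2, 47.9, 49.5, 51.0, 48.8, 47.6, 49.1, 50.2, 48.5, 47.3]
med, verdict = gate_latency(samples)
print(f"median={med:.2f}ms verdict={verdict}")
# The median is below the 50 ms gate, so the verdict is PASS even
# though one individual sample (51.0 ms) exceeded it.
```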
What Does the Model Submission Format Look Like?
EdgeGate accepts ONNX models with embedded weights. This is important — ONNX models with external weight files are not supported, because the compilation API requires a single self-contained artifact.
If your model uses external weights (common with large models exported from PyTorch), you'll need to convert it first:
import onnx
import os

# Load model with external weights
model = onnx.load("model.onnx", load_external_data=True)

# Save with embedded weights
onnx.save(
    model,
    "model_embedded.onnx",
    save_as_external_data=False,  # Embed weights in the .onnx file
)

# Verify
model_check = onnx.load("model_embedded.onnx")
onnx.checker.check_model(model_check)
print(f"Model size: {os.path.getsize('model_embedded.onnx') / 1024:.0f} KB")

EdgeGate computes a SHA-256 hash of the model file at submission time and includes it in the evidence bundle. This ensures the evidence bundle cryptographically references the exact model that was tested.
How Do You Wire This into GitHub Actions?
The final step is connecting your EdgeGate pipeline to your CI system. For GitHub Actions, add a workflow file to your repository:
name: EdgeGate Regression Test

on:
  pull_request:
    paths:
      - 'models/**'
      - 'training/**'
      - 'quantization/**'

jobs:
  edge-regression-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run EdgeGate Pipeline
        env:
          EDGEGATE_WORKSPACE_ID: ${{ secrets.EDGEGATE_WORKSPACE_ID }}
          EDGEGATE_API_SECRET: ${{ secrets.EDGEGATE_API_SECRET }}
          EDGEGATE_PIPELINE_ID: ${{ secrets.EDGEGATE_PIPELINE_ID }}
        run: |
          TIMESTAMP=$(date +%s)
          NONCE=$(openssl rand -hex 16)
          PAYLOAD="$TIMESTAMP.$NONCE"
          SIGNATURE=$(echo -n "$PAYLOAD" | \
            openssl dgst -sha256 -hmac "$EDGEGATE_API_SECRET" -hex | \
            awk '{print $2}')
          RESPONSE=$(curl -sf "https://api.edgegate.ai/v1/ci/github/run" \
            -X POST \
            -H "Content-Type: application/json" \
            -H "X-EdgeGate-Workspace: $EDGEGATE_WORKSPACE_ID" \
            -H "X-EdgeGate-Timestamp: $TIMESTAMP" \
            -H "X-EdgeGate-Nonce: $NONCE" \
            -H "X-EdgeGate-Signature: $SIGNATURE" \
            -d '{
              "pipeline_id": "'$EDGEGATE_PIPELINE_ID'",
              "model_path": "models/person_detect.onnx",
              "commit_sha": "'${{ github.sha }}'",
              "branch": "'${{ github.head_ref }}'",
              "pull_request": '${{ github.event.number }}'
            }')
          echo "Run ID: $(echo "$RESPONSE" | jq -r '.run_id')"
          echo "Status: $(echo "$RESPONSE" | jq -r '.status')"
          echo "Dashboard: $(echo "$RESPONSE" | jq -r '.dashboard_url')"

Authentication uses HMAC-SHA256 with anti-replay protection (timestamp + nonce). This prevents intercepted requests from being replayed and ensures only your CI system can trigger runs.
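If you need to debug a signature mismatch, you can mirror the scheme locally. Here is a sketch of the verification side, assuming the `timestamp.nonce` payload format used in the workflow and a 5-minute freshness window:

```python
# Sketch of server-side verification for the HMAC-SHA256 scheme:
# reject stale timestamps (anti-replay / clock skew), recompute the
# signature over "timestamp.nonce", and compare in constant time.
import hashlib
import hmac
import time

def verify(secret, timestamp, nonce, signature, now=None, window=300):
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > window:
        return False  # stale or future-dated request
    payload = f"{timestamp}.{nonce}".encode()
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature)

secret = "example-api-secret"
ts, nonce = "1700000000", "a1b2c3d4e5f60718"
sig = hmac.new(secret.encode(), f"{ts}.{nonce}".encode(), hashlib.sha256).hexdigest()
print(verify(secret, ts, nonce, sig, now=1700000060))   # True: fresh and valid
print(verify(secret, ts, nonce, sig, now=1700009999))   # False: outside the window
```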
What Happens When a Gate Fails?
When a quality gate fails, EdgeGate blocks the merge and provides actionable diagnostics:
- Which metric failed and by how much (e.g., "Inference latency: 67ms, gate: ≤ 50ms, exceeded by 34%")
- Device details — exact chipset, device model, firmware version
- Comparison to baseline — how the result compares to the last passing run on the same device
- Operator-level breakdown (when available) — which operators contributed most to latency or memory
- Evidence bundle link — the full signed report for audit and debugging
The PR check shows a summary, and the full evidence bundle is accessible from the EdgeGate dashboard. Your team can investigate the regression without needing physical access to a Snapdragon device.
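The "exceeded by" figure in those diagnostics is plain arithmetic over the gate threshold. A sketch using the example numbers above (the message format itself is illustrative):

```python
# How a failed-gate diagnostic's "exceeded by" percentage is computed:
# the overshoot relative to the gate threshold.
def failure_message(metric, measured_ms, limit_ms):
    exceeded_pct = (measured_ms - limit_ms) / limit_ms * 100
    return (f"{metric}: {measured_ms:.0f}ms, gate: <= {limit_ms:.0f}ms, "
            f"exceeded by {exceeded_pct:.0f}%")

print(failure_message("Inference latency", 67, 50))
# Inference latency: 67ms, gate: <= 50ms, exceeded by 34%
```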
What Are Common Integration Troubleshooting Issues?
| Issue | Fix |
|---|---|
| "AI Hub token invalid" | Regenerate your token at aihub.qualcomm.com. Tokens expire after 90 days. |
| "Device not available" | Check AI Hub device availability. Some devices have limited capacity during peak hours. |
| "Compilation failed" | Usually an unsupported operator in your ONNX model. Check AI Hub's supported operator list for your target chipset. |
| "External weights detected" | Your ONNX model has external weight files. Convert to embedded weights using onnx.save(model, path, save_as_external_data=False) |
| "HMAC signature mismatch" | Clock skew between your CI runner and EdgeGate. Ensure your runner's system clock is synced (NTP). Timestamps must be within 5 minutes. |
Get started with Qualcomm AI Hub + EdgeGate
Connect your AI Hub token, create a pipeline, and run your first on-device test in about 15 minutes.
Related Articles
Building a CI/CD Pipeline for On-Device AI Models
Step-by-step guide to adding regression gates in every pull request.
Evidence Bundles: Software Release Rigor for ML
Cryptographically signed proof that every model passed quality gates on real hardware.
Model Quantization Testing for Edge AI: FP32 vs INT8 on Real Hardware
How to catch quantization regressions on Snapdragon devices before production.