Person Detection on Snapdragon 8 Gen 3
FP32 vs INT8 model validation with automated quality gates and Ed25519-signed evidence bundles. Tested on real hardware via Qualcomm AI Hub.
Inference Time
FP32
0.1760ms
INT8
0.1870ms
✓ Gate: ≤ 1.0 ms
Peak Memory
FP32
121.51MB
INT8
124.66MB
✓ Gate: ≤ 150 MB
Model Size
FP32
1.07MB
INT8
0.32MB
Both models passed all quality gates
2/2 gates passed for FP32 • 2/2 gates passed for INT8 • Results are Ed25519-signed and tamper-proof
Executive Summary
EdgeGate automatically validated both FP32 and INT8 variants of a person-detection model on a real Snapdragon 8 Gen 3 (sm8650) device via Qualcomm AI Hub. Both models achieved sub-millisecond inference and passed all quality gates. The results are captured in Ed25519-signed evidence reports that provide tamper-proof proof of compliance.
The INT8 model is 70.5% smaller than FP32 (322 KB vs 1.07 MB) while maintaining comparable sub-millisecond inference. Both pass identical quality gates automatically.
Test Setup
Model Architecture
MobileNet-style depthwise separable CNN for binary person detection. Accepts 224×224 RGB images normalized to [0,1]. Outputs class logits for [no_person, person].
FP32: 270,146 parameters • 1.07 MB ONNX
INT8: 71,074 parameters • 322 KB ONNX
Target Device
Qualcomm Snapdragon 8 Gen 3 (sm8650), Samsung Galaxy S24 family. Tested via Qualcomm AI Hub cloud device farm.
Config: "Snapdragon 8 gen 3 multi"
Access: Qualcomm AI Hub API
Detailed Results
| Metric | FP32 | INT8 | Delta |
|---|---|---|---|
| Inference time | 0.1760 ms | 0.1870 ms | +6.2% |
| Peak memory | 121.51 MB | 124.66 MB | +2.6% |
| Model file size | 1.07 MB | 322 KB | -70.5% |
| Parameters | 270,146 | 71,074 | -73.7% |
| Gate pass rate | 2/2 (100%) | 2/2 (100%) | — |
Gate Evaluation
| Gate | Threshold | FP32 | INT8 |
|---|---|---|---|
| inference_time_ms | ≤ 1.0 ms | ✓ PASS(0.176 ms, 82.4% margin) | ✓ PASS(0.187 ms, 81.3% margin) |
| peak_memory_mb | ≤ 150 MB | ✓ PASS(121.51 MB, 19.0% margin) | ✓ PASS(124.66 MB, 16.9% margin) |
Key Observations
Both models are production-ready. Sub-millisecond inference with 80%+ margin against the gate threshold means there is significant headroom for more complex models or tighter latency budgets.
Model size is the primary INT8 benefit. The INT8 variant is 70.5% smaller on disk, which matters for OTA updates, storage-constrained devices, and download times. Inference performance is comparable.
Runtime memory is similar. Despite having 73.7% fewer parameters, the INT8 model uses slightly more peak memory (124.66 MB vs 121.51 MB). This is expected — the Snapdragon runtime allocates device-level resources that aren't proportional to model size alone.
Automated gating works. Both models were automatically evaluated against quality gates with clear PASS/FAIL verdicts. No manual interpretation needed.
Evidence & Auditability
Each benchmark run produced a signed evidence report containing model identity (SHA-256 hash), device attestation (hardware ID, firmware version, runtime configuration), test configuration, raw metrics, gate verdicts, and an Ed25519 cryptographic signature ensuring tamper-proof results.
FP32 Evidence Report
dc2e9f67
SHA-256: 0a1baffb1197...
INT8 Evidence Report
875d3c6f
SHA-256: ef1d360ccc60...
Methodology
Models were uploaded to Qualcomm AI Hub as ONNX format and compiled for the target Snapdragon 8 Gen 3 chipset. AI Hub ran on-device profiling and returned inference time and peak memory metrics. EdgeGate consumed these metrics, evaluated them against configured gates, and generated signed evidence reports.
Limitations: This benchmark covers a single model architecture (person detection CNN) on one device family (Snapdragon 8 Gen 3). The INT8 variant uses channel reduction rather than true post-training quantization. Single test configuration was used.
Run your own benchmark
EdgeGate tests your models on real Snapdragon devices with automated quality gates and signed evidence. Free tier includes 10 runs/month.