An auto-discovery pattern: eliminating hardcoded device-to-site mapping
We were trying to solve a fundamental issue in our system. It looks simple when written down, but it didn’t feel simple at all when we first started working on it.
In our system, there are around 50–100 IoT devices constantly pushing data. Each device sends more than 10 messages per second. At any given moment, at least 30 devices are actively publishing telemetry.
So even at a minimum level, the system is always receiving a steady stream of messages.
A typical message is published to the broker on a topic like:
```
system/locationX/device1
```
And the payload looks something like this:
```json
{
  "data": {
    "amps_a": 0,
    "amps_b": 0,
    "amps_c": 0,
    "battery_voltage": 12.2,
    "coolant_temp": 0,
    "engine_hours": 485.15,
    "frequency": 0,
    "kvar": 0,
    "kw": 0,
    "kwh": 5869.8,
    "not_in_auto": 0,
    "oil_pressure": -0.72,
    "power_factor": 0,
    "rpm": 0,
    "volts_ab": 0,
    "volts_bc": 0,
    "volts_ca": 0
  },
  "device": {
    "id": "deviceX",
    "serialNum": "serial"
  }
}
```
The data itself isn’t the problem here. The real issue is where this data ends up and how the system figures out which site the device belongs to.
For every location, there can be multiple devices. In the example above, I’ve shown just one device in one location, but the real system is much more dynamic.
What’s the problem, really?
Inside the application (our backend system), devices are usually identified using internal identifiers. These identifiers are often database primary keys or some form of system-generated id.
This leads to a few common approaches.
1. ID-based device identification
In this approach, each device needs to know the database id of the site or system it belongs to. That means the device must be synced with information from the backend.
This is effectively a system → device sync.
It works, but it introduces friction:
- Devices must be provisioned carefully.
- Any mismatch between database ids and device configuration breaks ingestion.
- Replacing or resetting devices becomes painful.
- Environments (dev, staging, prod) need separate configs.
The system becomes tightly coupled to the device configuration.
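To make the coupling concrete, here is a hypothetical device-side snippet for this approach. The hardcoded SITE_ID, the broker address, and where the id appears (in the topic rather than the payload) are all made up for illustration:

```python
# Hypothetical device-side publisher for approach 1: the device carries
# the backend's database id (SITE_ID) in its own configuration.
import json
import paho.mqtt.publish as publish

SITE_ID = 147          # backend primary key, provisioned onto the device by hand
DEVICE_ID = "deviceX"

publish.single(
    topic=f"system/{SITE_ID}/{DEVICE_ID}",
    payload=json.dumps({"data": {"battery_voltage": 12.2}, "device": {"id": DEVICE_ID}}),
    hostname="broker.internal",  # placeholder broker address
)
```

If the SITE_ID baked into the device ever drifts from the database, ingestion quietly breaks — exactly the friction listed above.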
2. String-based identifiers
Instead of numeric ids, we use string identifiers. These are easier to read and remember than auto-generated numeric keys.
This is slightly better, but the core issue remains:
- Devices still need prior knowledge about the system.
- Configuration mistakes still happen.
- The mapping logic still lives on the device side.
So while this improves readability, it doesn’t really solve the problem.
Rethinking the problem
Instead of asking “How does the device know where to send data?”, we flipped the question.
What if the device doesn’t need to know anything at all?
Devices already publish messages. They already have topics. They already send identifiers like serial numbers or device ids.
So we let devices publish messages as they normally would.
On the system side, the MQTT client simply listens to everything and writes all incoming topics into a table. At this stage, the system does not try to understand or process the data fully.
This becomes an auto-discovery phase.
The system passively discovers:
- new devices
- new topics
- last seen timestamps
- sample payloads
No hardcoded mapping. No pre-configuration.
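The discovery listener itself can be very small. Here is a minimal sketch, assuming paho-mqtt 2.x and a local SQLite store; the broker address and the device_mapping table and column names are illustrative, not our exact implementation:

```python
# Minimal discovery listener: subscribe to everything and record what shows up.
# paho-mqtt 2.x and SQLite are assumptions; table and column names are illustrative.
import json
import sqlite3
from datetime import datetime, timezone

import paho.mqtt.client as mqtt

db = sqlite3.connect("discovery.db", check_same_thread=False)
db.execute("""
    CREATE TABLE IF NOT EXISTS device_mapping (
        device_id      TEXT PRIMARY KEY,
        topic          TEXT,
        site_id        INTEGER,            -- NULL until an admin maps the device
        status         TEXT DEFAULT 'unmapped',
        last_seen      TEXT,
        sample_payload TEXT
    )
""")

def on_message(client, userdata, msg):
    # Discovery only: no site lookup, no telemetry processing.
    try:
        payload = json.loads(msg.payload)
        device_id = payload["device"]["id"]
    except (ValueError, KeyError, TypeError):
        device_id = msg.topic  # fall back to the topic if the payload is malformed
    db.execute(
        """INSERT INTO device_mapping (device_id, topic, last_seen, sample_payload)
           VALUES (?, ?, ?, ?)
           ON CONFLICT(device_id) DO UPDATE SET
               topic = excluded.topic,
               last_seen = excluded.last_seen,
               sample_payload = excluded.sample_payload""",
        (device_id, msg.topic,
         datetime.now(timezone.utc).isoformat(),
         msg.payload.decode(errors="replace")),
    )
    db.commit()

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.internal", 1883)  # placeholder broker address
client.subscribe("system/#")             # wildcard: every location, every device
client.loop_forever()
```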
Assigning devices to sites
Once topics are discovered, we still need to answer one question:
Which site does this device belong to?
This is where a human-in-the-loop step makes sense.
Newly discovered devices start in an unmapped state. They appear in an admin dashboard where someone can inspect them and decide what to do.
The lifecycle looks like this:
- Unmapped
  - Device is discovered.
  - Messages are ignored.
  - Device appears in the admin UI.
- Mapped
  - Admin assigns the device to a site.
  - Messages start getting processed.
  - Data is stored normally.
- Ignored
  - Device is marked as test, invalid, or noise.
  - Messages are discarded.
This keeps ingestion safe by default and avoids accidental data pollution.
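In code, that safety is just a status check in front of the processing path. A sketch, reusing the illustrative device_mapping table from above; process_telemetry is a hypothetical stand-in for the real storage pipeline:

```python
# Ingestion gate sketch: only mapped devices reach the processing pipeline.
def process_telemetry(site_id: int, device_id: str, data: dict) -> None:
    ...  # hypothetical stand-in: the real pipeline stores telemetry per site

def handle_message(db, device_id: str, payload: dict) -> None:
    row = db.execute(
        "SELECT site_id, status FROM device_mapping WHERE device_id = ?",
        (device_id,),
    ).fetchone()

    if row is None or row[1] == "unmapped":
        return  # discovered but not yet assigned: nothing beyond discovery is saved
    if row[1] == "ignored":
        return  # test device or noise: discard silently

    process_telemetry(site_id=row[0], device_id=device_id, data=payload["data"])
```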
```mermaid
stateDiagram-v2
    [*] --> Unmapped: New device discovered
    Unmapped --> Mapped: Admin assigns to site
    Mapped --> Unmapped: Admin removes mapping
    Unmapped --> Ignored: Mark as test device
    Ignored --> Mapped: Re-enable & map
    note right of Unmapped
        Messages ignored
        Device appears in
        admin dashboard
    end note
    note right of Mapped
        Messages processed
        Data saved
    end note
    note right of Ignored
        Messages discarded
    end note
```
The state flow looks like this:
- New device → Unmapped
- Unmapped → Mapped (admin action)
- Unmapped → Ignored
- Ignored → Mapped (if re-enabled)
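Each transition is nothing more than an update to the mapping row. A sketch of the admin-side actions, again against the illustrative device_mapping table:

```python
# Admin actions sketch: each state transition is a single-row update.
def map_device(db, device_id: str, site_id: int) -> None:
    # Unmapped/Ignored -> Mapped: assign the device to a site.
    db.execute(
        "UPDATE device_mapping SET site_id = ?, status = 'mapped' WHERE device_id = ?",
        (site_id, device_id),
    )
    db.commit()

def unmap_device(db, device_id: str) -> None:
    # Mapped -> Unmapped: remove the assignment, stop processing.
    db.execute(
        "UPDATE device_mapping SET site_id = NULL, status = 'unmapped' WHERE device_id = ?",
        (device_id,),
    )
    db.commit()

def ignore_device(db, device_id: str) -> None:
    # Unmapped -> Ignored: mark the device as test or noise.
    db.execute(
        "UPDATE device_mapping SET status = 'ignored' WHERE device_id = ?",
        (device_id,),
    )
    db.commit()
```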
```mermaid
erDiagram
    DEVICE_MAPPING ||--o| SITES : "maps to"
    DEVICE_MAPPING {
        string device_id PK "sensor_alpha_42"
        int site_id FK "147"
        enum status "mapped | unmapped | ignored"
    }
    SITES {
        int site_id PK
        string name
    }
```
Data model (simple and intentional)
At the database level, the mapping is straightforward.
Each device has:
- a device identifier
- an optional site id
- a status (mapped / unmapped / ignored)
The site table remains unchanged.
This separation makes it clear:
- devices exist independently
- mapping is an explicit decision
- processing depends on status
Why this worked better for us
This approach removed a lot of hidden assumptions:
- Devices no longer depend on database ids.
- Provisioning became easier.
- New devices can be added without touching backend configs.
- Mistakes are visible and reversible.
- Test devices don’t interfere with production data.
Most importantly, the system became observable first, instead of configured first.
Closing thoughts
This pattern won’t eliminate all complexity, but it moves it to a place where it’s easier to manage — the system side, not the device side.
By separating discovery from identity assignment, we made the system more flexible and safer to operate at scale.