A proposed template for more secure IoT edge archi...

JayThvV · ‎07-27-2016

You will hear more about this, and be able to discuss the content of this blog, during TechEd 2016 in the SEC205 - Recommended Practices for Security in Industrial IoT Scenarios session. I look forward to seeing you there if you would like to know more or discuss specific scenarios, and especially if you disagree and have a strong opinion about this, because I definitely want to talk to you. Hope to meet you there!

In a previous blog post, I shared some general recommendations on how to approach industrial IoT edge security at a high level. In another, I stressed the importance of client certificates for device identity. I'd now like to tie it all together in a proposed architecture template that should be fairly widely applicable in industrial IoT, while being flexible enough to allow for variation for unique use cases and scenarios, and for differences dictated by communication channel or physical location.

As before, I feel that the lessons of 2-3 decades of security in traditional IT have relevance in an IoT scenario. We don't have to reinvent the wheel. There are best practices from IT security that can be applied, and given that the nature of M2M communication is so much more predictable than human-to-machine interaction, we have even more tools available to us to provide an edge infrastructure landscape that we can consider reasonably secure.

Assumptions

I do want to be clear about where this applies, and where this might require major adjustment. I am making a number of assumptions here, without which this template is likely not easily applicable:

The device infrastructure is 32-bit or above
Devices run Linux or a Linux variant (or other reasonably feature-rich OS)
Network and communication channels are completely or mostly controlled (LAN, private networks, etc.)

The reason for these assumptions is that we need to:

Have a device infrastructure that is capable of PKI cryptography
Make use of existing security features and capabilities within the Linux OS
Have some level of control over the network the data traffic occurs in

IoT Edge Security Landscape Template

The full landscape is displayed in the diagram below.

Let's talk through the various components that make this up. On the very right, we have the SAP IoT Solution, running on-premise or in the cloud. This is where the data from the edge is received. This environment should be protected with standard application, OS and network security, generalized under "Enterprise Security" (or cloud, in case of cloud). This is not the focus of this article. Instead, let's look at the components to the left of it.

Client Certificates

In this landscape, the devices each have their own X.509 PKI client certificate. There is a certificate server that both the device and the backend server can contact to verify server and client certificates, which guarantees mutual trust. It is important that the device itself holds the certificate, to ensure end-to-end trusted communication. Certificates could also be placed on the gateway, but that is not ideal, as it causes a trust issue between the device and the gateway. If we want to guarantee that device data is indeed coming from the device we believe it comes from, placing the certificate directly on the device itself is the way to go. If the device infrastructure doesn't allow it, make sure that the communication between device and gateway (which usually does protocol translation) is secured.

Of course, we also need a mechanism to suspend, revoke and replace certificates, and to make replacement possible the device should be capable of OTA administration and updates.
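To make the mutual trust in this section a bit more concrete, here is a minimal sketch of how a Linux device might set up its side of a mutually authenticated TLS connection using Python's standard `ssl` module. The function name and the file path parameters are hypothetical and only for illustration; revocation checking and certificate renewal are not covered here.

```python
import ssl

def make_device_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context for a device (illustrative sketch).

    ca_file:   CA bundle used to verify the backend server (hypothetical path)
    cert_file: the device's own X.509 client certificate (hypothetical path)
    key_file:  the private key belonging to cert_file (hypothetical path)
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # PROTOCOL_TLS_CLIENT already enables both checks; restated for clarity.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Trust only the CA of this landscape, not the system-wide store.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Present the device certificate so the server can verify the device.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

The device would then wrap its socket with `ctx.wrap_socket(sock, server_hostname=...)`. The server side would use a corresponding context that requires a client certificate, so that both ends verify each other.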

Aggressive Firewalling

In traditional host firewalling of clients, we tend to block traffic initiated from outside (with some exceptions, such as ping), but allow any traffic originating from the host itself. That makes sense, as it is often hard to anticipate where a human user wants to go. We may blacklist certain websites, but that is always patchy. In M2M, we don't really have that problem, and we should know very well in advance what endpoints and other services our devices should be communicating with.

That means we can be much stricter in our firewall rules. Our device never needs to go to google.com or amazon.com, let alone some remote server in another country. We can actually block traffic to any location other than a series of whitelisted endpoints and services: our data ingestion endpoint, a certificate server, a time server and DNS, and perhaps a handful of others. This is much stronger than blacklisting, and since firewall rules require root-level privileges to set up, even physical access to the device (without root access) would not be sufficient for an attacker to change them.
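As a rough sketch of what such whitelisting could look like on a Linux device, the following generates `iptables` commands for outbound traffic from a list of allowed endpoints. The endpoint names, IP addresses and ports are made-up examples, and the sketch only covers the OUTPUT chain; a real rule set would also need matching inbound rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Endpoint:
    name: str
    ip: str
    proto: str  # "tcp" or "udp"
    port: int

# Hypothetical whitelist; addresses and ports are assumptions for illustration.
ALLOWED = [
    Endpoint("data-ingestion", "10.0.0.10", "tcp", 8883),
    Endpoint("certificate-server", "10.0.0.20", "tcp", 443),
    Endpoint("time-server", "10.0.0.30", "udp", 123),
    Endpoint("dns", "10.0.0.40", "udp", 53),
]

def egress_rules(endpoints):
    """Return iptables commands that allow only the listed destinations."""
    # Keep local loopback traffic working.
    rules = ["iptables -A OUTPUT -o lo -j ACCEPT"]
    for ep in endpoints:
        rules.append(
            f"iptables -A OUTPUT -d {ep.ip} -p {ep.proto} "
            f"--dport {ep.port} -j ACCEPT"
        )
    # Set the default policy last so everything not listed above is dropped.
    rules.append("iptables -P OUTPUT DROP")
    return rules

if __name__ == "__main__":
    for rule in egress_rules(ALLOWED):
        print(rule)
```

Generating the rules from a single list keeps the whitelist reviewable in one place. Applying them still requires root, which is exactly the property the paragraph above relies on.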

Solution specific non-hierarchical DNS

Again, since we know what our devices should be communicating with and they never need to go to random websites, we can be quite strict about which host names we allow to resolve. A solution-specific DNS can be set up that only contains entries for the relevant components in the landscape. It is non-hierarchical because it never asks another DNS server for resolutions it doesn't already have itself. Alternatively, you could provide a hosts file to each of the devices, but a DNS server allows some flexibility in the landscape for maintenance, fail-over, fault tolerance and load balancing, without requiring an update of the entire device infrastructure.
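The resolution behavior described here can be sketched as a simple lookup against a fixed table with no forwarding. This is only a model of the logic, not a DNS server; the host names and addresses are invented for illustration.

```python
# Hypothetical zone for the landscape; names and addresses are assumptions.
ZONE = {
    "ingest.iot.example": ["10.0.0.10", "10.0.0.11"],  # two hosts for fail-over
    "certs.iot.example": ["10.0.0.20"],
    "time.iot.example": ["10.0.0.30"],
}

def resolve(name, zone=ZONE):
    """Return the addresses for a known name, or an empty list.

    Unknown names are never forwarded to another resolver, which is
    what makes the setup non-hierarchical.
    """
    key = name.rstrip(".").lower()  # DNS names are case-insensitive
    return list(zone.get(key, []))
```

For example, `resolve("ingest.iot.example")` returns both ingestion addresses, while `resolve("google.com")` returns an empty list. Changing an entry in the table redirects every device without touching the devices themselves, which is the advantage over per-device hosts files.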

Network Access Control, Software Defined Networks, Intrusion Detection

Networking is getting increasingly intelligent, with two key aspects relevant here: rogue host identification and quarantining, and policy and profile based networking. The first allows us to keep unknown hosts isolated and out of the environment, unable to connect to resources on the network. Even just detecting that an unknown device has attempted to join the network is important information, as it could be the first step in an attempted attack. The second aspect, profile and policy based networking, means that we can be quite strict about the nature of the data traffic, as it will again be quite predictable where the traffic can go (so anything else raises an alarm) and what the nature of that traffic is (if, for instance, the data size increases dramatically, that would be unexpected behavior).
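To illustrate the profile idea, here is a small sketch of a check that flags traffic going to an unexpected destination or exceeding an expected volume. The profile fields, thresholds and alert names are assumptions for illustration; real NAC, SDN or intrusion detection products express such policies in their own ways.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrafficProfile:
    allowed_destinations: frozenset
    max_bytes_per_minute: int

def check_flow(profile, destination, bytes_last_minute):
    """Return a list of alert names for one observed flow (empty if normal)."""
    alerts = []
    if destination not in profile.allowed_destinations:
        alerts.append("unexpected_destination")
    if bytes_last_minute > profile.max_bytes_per_minute:
        alerts.append("unexpected_volume")
    return alerts

# Hypothetical profile for a sensor that sends small readings to one endpoint.
SENSOR_PROFILE = TrafficProfile(
    allowed_destinations=frozenset({"10.0.0.10"}),
    max_bytes_per_minute=50_000,
)
```

A flow of 2,000 bytes to `10.0.0.10` produces no alerts under this profile, while a sudden 5 MB transfer to an unknown address would raise both.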

Honeypots pretending to be devices

We can further enhance our ability to detect potential attacks by using honeypots that impersonate real devices. Honeypots are specifically configured hosts that look like normal hosts, typically made slightly more attractive as targets, but that instead provide a sandbox that "traps" the attacker. The idea is that a honeypot attracts attackers and hopefully keeps them busy for a while, allowing us to capture their IP address, what actions were attempted, etc. This tells us first of all that an attack is taking place, and second what the nature of the attempt was. We can also use the harvested information to block that particular IP address.
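A very reduced sketch of the capture side of a honeypot is shown below: a listener that presents a fake login banner and records the source IP and the first bytes the visitor sends. The banner text, function names and record fields are invented for illustration; a real honeypot would emulate a device far more convincingly and isolate the attacker properly.

```python
import socket
import time

# Hypothetical banner imitating a device login prompt.
BANNER = b"sensor-gw login: "

def handle_connection(conn, peer_ip, log):
    """Send the fake banner, capture what the visitor sends, and log it."""
    conn.settimeout(2.0)
    conn.sendall(BANNER)
    try:
        data = conn.recv(1024)
    except socket.timeout:
        data = b""
    log.append({
        "time": time.time(),
        "source_ip": peer_ip,
        # latin-1 maps every byte to a character, so decoding never fails.
        "payload": data.decode("latin-1"),
    })

def run_honeypot(host, port, log):
    """Accept connections forever, handling one at a time (blocking)."""
    with socket.create_server((host, port)) as srv:
        while True:
            conn, (peer_ip, _peer_port) = srv.accept()
            with conn:
                handle_connection(conn, peer_ip, log)
```

Each entry in `log` gives a response team the two things mentioned above: evidence that an attempt took place, and a source address that could be fed into a block list.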

Monitoring and Alerts: SIEM

All components in the infrastructure should be monitored and should report into a central SIEM solution. Such monitoring can happen with or without on-device agents, as well as through network monitoring, depending on the monitoring solution used. The honeypot infrastructure should also report into the SIEM, so that there is a centralized, holistic view of the security status of the environment that a response team can act on.
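One practical consequence of central reporting is that events from very different components need a common shape. The sketch below normalizes an event into a JSON record; the field names are assumptions for illustration and do not follow the schema of any particular SIEM product.

```python
import json
from datetime import datetime, timezone

def to_siem_event(source, kind, detail, when=None):
    """Serialize one security event as JSON with a UTC timestamp.

    source: reporting component, e.g. a firewall, honeypot or device
    kind:   short event type chosen by the reporter
    detail: dict with event-specific fields
    """
    when = when or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "source": source,
        "kind": kind,
        "detail": detail,
    }
    return json.dumps(record, sort_keys=True)
```

A honeypot could, for instance, emit `to_siem_event("honeypot-01", "connection_attempt", {"source_ip": "192.0.2.7"})`, and a device firewall could report a dropped packet in the same format, so both land in the same view.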

Closing

We are still in the early days of IoT and Industrie 4.0, and there is really not that much practical concrete experience in securing IoT environments yet, so I would expect the template to evolve over time, get richer, or more concrete and defined. We will no doubt run into scenarios where this template needs to be significantly adjusted, whether because of the capabilities of the device infrastructure or the communication channel we are using, and we may need to have multiple templates accordingly. However, I hope at least it is a starting point to have a general sense of what components we should take into consideration to protect IoT edge landscapes, and the solutions they feed their data to.