DNP3 SAv5 and TLS: Different trust boundaries
Historical Note: This post originally appeared on Automatak.com. Subsequent analysis under a DHS grant changed my opinion on DNP3 SAv5 substantially. A paper I co-authored with Sergey Bratus, published by IEEE S&P and available here, better summarizes my technical opinion of DNP3 SAv5.
The purpose of this post is not to compare the merits of SAv5 vs. TLS, but rather to point out how the security concept of trust boundaries applies to the analysis of DNP3 implementations themselves.
Distributed Network Protocol (DNP) has an application layer authentication standard called Secure Authentication (SA) that will soon be widely implemented in critical industrial control systems. It uses a challenge-response architecture and secure hashing to ensure that critical messages like controls are authenticated before they are processed.
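To make the challenge-response idea concrete, here is a minimal sketch in Python. It is not the SAv5 wire format or key management scheme: the function names, the `SESSION_KEY` value, and the choice of HMAC-SHA-256 over `challenge + message` are all assumptions made for illustration. It only shows the general shape of binding a fresh challenge to a critical message with a keyed hash.

```python
import hashlib
import hmac
import secrets

# Hypothetical pre-shared session key (assumption; real key management is more involved).
SESSION_KEY = secrets.token_bytes(32)


def make_challenge() -> bytes:
    """Challenger side: issue a fresh random nonce for each critical request."""
    return secrets.token_bytes(16)


def answer_challenge(key: bytes, challenge: bytes, critical_message: bytes) -> bytes:
    """Responder side: bind the nonce to the message with a keyed hash."""
    return hmac.new(key, challenge + critical_message, hashlib.sha256).digest()


def verify_response(key: bytes, challenge: bytes, critical_message: bytes, tag: bytes) -> bool:
    """Challenger side: recompute the keyed hash and compare in constant time."""
    expected = hmac.new(key, challenge + critical_message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    challenge = make_challenge()
    message = b"OPERATE breaker 7"  # made-up critical message
    tag = answer_challenge(SESSION_KEY, challenge, message)
    print(verify_response(SESSION_KEY, challenge, message, tag))               # True
    print(verify_response(SESSION_KEY, challenge, b"OPERATE breaker 8", tag))  # False
```

Because the tag covers both the nonce and the message, a recorded response cannot be replayed against a later challenge, and a tampered message fails verification.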
IMO, the three main benefits of putting the authentication at the application layer are:
- Transport neutrality: It works on a serial port or spread spectrum radio
- Granularity: Individual agents can be authenticated leading to an audit trail
- Bandwidth reduction / flexibility: Not all data needs to be authenticated, only those messages deemed critical.
Using SA does not preclude using Transport Layer Security (TLS). The two can work together. Even though they do different things in different ways, they both provide a duplicate function: ensuring that critical data comes from a trusted source. If you think of all the stuff they do, you can crudely summarize their relationship with a Venn diagram in which the two overlap on that one function.
The standard has been vetted by security experts and, like everything that the DNP3 technical committee puts out, is well thought-out and exhaustively documented. My concern, however, is that asking vendors to implement SA is not itself a recipe for secure DNP3 devices. I believe a lot of people know this already, but without specific warnings, SA may give vendors and utilities a false sense of confidence if the companies that implement the protocol in their devices do not do proper security testing.
It comes down to the security concept of trust boundaries and how much software surface area is exposed to unauthenticated data. Hackers don’t need to break your authentication mechanisms to compromise a device if they can exploit a bug in software that’s outside your trust boundary. With SA, the DNP3 stack itself is outside the trust boundary. With TLS, the stack is inside the trust boundary.
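The difference in where the parser sits can be sketched with a toy model. Everything below is hypothetical: `toy_parse` stands in for a DNP3 stack and has a deliberate length-field bug, `KEY` is a placeholder, and the "TLS-style" path is reduced to a single keyed-hash check over the raw bytes. The point is only the ordering of parsing and authentication.

```python
import hashlib
import hmac

KEY = b"hypothetical-shared-key"  # stand-in for real key material (assumption)


def toy_parse(frame: bytes) -> bytes:
    """Toy stand-in for a protocol stack: first byte is a length, the rest is payload.

    Deliberately buggy: it trusts the length field instead of checking it.
    """
    length = frame[0]
    payload = frame[1:]
    return bytes(payload[i] for i in range(length))  # IndexError if the length lies


def sa_style_receive(frame: bytes) -> bytes:
    """Application-layer auth: the stack parses raw bytes before any auth decision."""
    return toy_parse(frame)  # authentication of critical messages would happen after this


def tls_style_receive(record: bytes) -> bytes:
    """Transport-layer auth: check a tag over the raw bytes, then hand them to the parser."""
    frame, tag = record[:-32], record[-32:]
    expected = hmac.new(KEY, frame, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise PermissionError("unauthenticated record dropped before parsing")
    return toy_parse(frame)


if __name__ == "__main__":
    malicious = b"\x05ab"  # claims 5 payload bytes, carries 2
    try:
        sa_style_receive(malicious)
    except IndexError:
        print("SA-style path: parser bug reached by unauthenticated bytes")
    try:
        tls_style_receive(malicious + b"\x00" * 32)  # attacker cannot forge the tag
    except PermissionError:
        print("TLS-style path: record rejected before the parser ran")
```

Note that the parser bug is still there in the second path. An authenticated peer could trigger it, but an unauthenticated attacker cannot.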
A utility that is solely relying on SAv5 for authentication should hold DNP3 vendors to a higher standard than if it were using TLS. Conformance testing (and SAv5 testing) is not a sufficient criterion for selecting secure hardware when solely running SA, because so much software is outside the trust boundary. If the utility runs a good implementation of TLS instead (or in addition), they can sleep a little better at night without doing security audits on their DNP3 devices themselves. Libraries that implement TLS, like OpenSSL, are rigorously punished with tools like fuzzers to lower the probability of an exploitable defect, and they have the eyes of hundreds of security experts on them. If TLS is filtering out all non-authenticated messages, you don’t have to hold the DNP3 implementation to such a high standard.
DNP3 is a very large protocol. The Modbus specification is about 70 pages and has very little state to track. The DNP3 specification is over 1,000 pages and is incredibly stateful. It’s commonly accepted that defect rates in software are proportional to the size of a code-base and that stateful software is harder to implement correctly than stateless software. Therefore, going by page count alone (1,000 / 70 ≈ 14), it is not unreasonable to assert that a programmer is almost 15 times more likely to miss a corner case while developing DNP3 than Modbus.
The only mitigation to this type of vulnerability is to insist that any software examining unauthenticated data is ‘white hat’ tested with tools like protocol fuzzers to lower the probability of an exploitable defect. TLS provides protection against software defects in the DNP stack itself, whereas the sole application of SA does not.
Having a large surface area of code outside a trust boundary that has not been rigorously tested is a recipe for disaster. Testing negative (malformed) inputs is critical in any software outside a trust boundary. Fuzzing offers a complementary approach to code inspection and unit tests.
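As a rough illustration of negative testing, here is a minimal mutation fuzzer sketch in Python. It is not a real protocol fuzzer: `toy_parse` is a hypothetical parser with a planted length-field bug, and the mutation strategy (overwrite, delete, or insert random bytes) is an assumption chosen for brevity. The harness treats a `ValueError` as a clean rejection and records any other exception as a finding.

```python
import random


def toy_parse(frame: bytes) -> bytes:
    """Hypothetical parser: first byte is a length, the rest is payload (buggy on purpose)."""
    if len(frame) < 1:
        raise ValueError("empty frame")  # clean rejection of malformed input
    length = frame[0]
    payload = frame[1:]
    return bytes(payload[i] for i in range(length))  # IndexError if the length lies


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one to three random byte-level mutations to a known-good frame."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        choice = rng.random()
        if choice < 0.5 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite a byte
        elif choice < 0.75 and data:
            del data[rng.randrange(len(data))]                   # delete a byte
        else:
            data.insert(rng.randint(0, len(data)), rng.randrange(256))  # insert a byte
    return bytes(data)


def fuzz(parser, seed: bytes, iterations: int = 2000, rng_seed: int = 0):
    """Feed mutated frames to the parser; collect inputs that fail in an unexpected way."""
    rng = random.Random(rng_seed)
    findings = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            parser(case)
        except ValueError:
            pass  # the parser noticed the malformed input and refused it
        except Exception as exc:  # anything else is a defect worth triaging
            findings.append((case, type(exc).__name__))
    return findings


if __name__ == "__main__":
    results = fuzz(toy_parse, b"\x03abc")
    print(f"{len(results)} unexpected failures")
    if results:
        print("example input:", results[0][0].hex(), "->", results[0][1])
```

In a compiled stack the equivalent of that `IndexError` would be an out-of-bounds read or write, which is the kind of defect an attacker outside the trust boundary looks for.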