Stuxnet, and the Case for Cybersecurity in Critical Infrastructure
One of the experiences of growing up in a war-torn country is the constant vigilance of your parents, who expect danger from every direction. This was the case in my upbringing during the 1990s and 2000s, when the civil war between the Sri Lankan state and the home-grown terrorist organization, the LTTE, was escalating rapidly.
The LTTE was different from other terrorist organizations of the time in its targeting of the critical infrastructure of the state using varying methods of bombing. Notable examples include the bombings of the Kolonnawa Oil Tanks (1995), the Central Bank (1996), and Katunayake International Airport (2001), and the air bombings of the Kolonnawa Oil Tanks (2007), Muthurajawela Gas Storage (2007), and the Kelanitissa Power Plant (2008). I remember the air bombings in 2007 and 2008 very well, as there was a precautionary power cut in the middle of the night until the air raid was over.
These experiences heightened my parents' fears, given our close proximity to the Sapugaskanda Oil Refinery, which was the country's only oil refinery at the time. What they feared was not the bombing itself, but a subsequent release of poisonous gas clouds or fire into the surrounding areas. Later in life, when I majored in Chemical Engineering, I was actually able to study these scenarios and their impact radii.
Regardless of how much the general public feared such an attack, arranging the logistics of one was difficult and costly. With the growth of the internet, however, that is no longer the case.
Industrial Automation and PLCs
Industrial automation, born perhaps with the invention of the assembly line, focuses on automating the work of an industrial setting, usually the monotonous physical work that does not require a significant level of intelligence. The level and methods of automation have tracked the industrial revolutions; from the 3rd Industrial Revolution onward, even non-physical work started being automated.
You might be familiar with the steam engine and the assembly line, very basic milestones in the automation of physical work in the 1800s and early 1900s. The first two industrial revolutions and the innovative products that resulted from them, coupled with cheap resources and markets from the colonies, enabled the Western world to launch a market economy that improved living standards drastically. These two industrial revolutions are very well documented and sometimes taught in high-school curricula, in contrast to the 3rd Industrial Revolution.
The 3rd Industrial Revolution was propelled by the development of semiconductor devices, which made it possible to automate not only mechanical work, as in the previous two industrial revolutions, but also the electrical control work traditionally done with relays, switches, and the like. This was extremely important in drastically reducing the cost of automating mechanical systems and of carrying out further modifications without having to redesign from scratch.
For example, if you have automated your elevator with a few relays and switches, you have to rebuild the wiring and electrical work if the factory decides to add an additional platform (floor). With a Programmable Logic Controller (PLC), however, you only have to change a few lines of code to make the elevator support the additional platform.
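To make the contrast concrete, here is a minimal sketch of elevator logic written the way a PLC program treats it: as data plus logic. All names and the floor layout are illustrative assumptions, not taken from any real elevator controller.

```python
# Hypothetical sketch: elevator logic as a PLC-style program.
# With hard-wired relays, adding a floor means physical rewiring;
# here it is a one-line change to the FLOORS constant.

FLOORS = [1, 2, 3]  # adding a platform is just: FLOORS = [1, 2, 3, 4]

def next_move(current: int, requested: int) -> str:
    """Decide the motor command from the current and requested floor."""
    if requested not in FLOORS:
        return "IGNORE"   # request outside the configured platforms
    if requested > current:
        return "UP"
    if requested < current:
        return "DOWN"
    return "STOP"
```

The point is not the code itself but where the change lives: in a relay cabinet the floor count is baked into copper wiring, while in a program it is one editable value.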
Programmable Logic Controllers, or PLCs, are exactly what the name says: programmable, logic-based controllers.
To deconstruct that, controllers are simply the control element of a control system. If you are building a motion-detected lighting system, it needs three parts: a sensor, a controller, and an actuator. In such a system, the sensor would be a thermal sensor that detects human presence or movement; the controller would be a circuit, or something more complex, in which the logic of the system is built; and the actuator would be the lights. The end result is the controller sensing the presence of a human through the sensor and switching on the lights. This is a very simple control system.
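The sensor-controller-actuator loop above can be sketched in a few lines. Everything here (the threshold, the function names) is an illustrative assumption; real systems would poll hardware rather than pass values around.

```python
# Minimal sketch of the sensor -> controller -> actuator loop for a
# motion-detected lighting system. All names and values are illustrative.

def thermal_sensor(reading: float) -> bool:
    """Sensor: report presence when the thermal reading crosses a threshold."""
    PRESENCE_THRESHOLD = 0.5
    return reading > PRESENCE_THRESHOLD

def controller(presence: bool) -> str:
    """Controller: the logic of the system lives here."""
    return "ON" if presence else "OFF"

def actuator(command: str) -> str:
    """Actuator: the lights simply obey the controller's command."""
    return f"lights {command}"

# One pass through the control loop:
print(actuator(controller(thermal_sensor(0.9))))  # prints "lights ON"
```

Notice that all the decision-making sits in `controller`; the sensor and actuator are deliberately dumb. That separation is exactly what makes the controller the natural place to put a programmable device.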
What the PLC brought to the game was the ability to reprogram the controller without changing the circuitry or the electrical system associated with it. This was similar to how Unix changed software development, allowing developers to program for varying hardware with a single language and reducing the need for specialization. The PLC replicated that disruption in the controller hardware industry, making modifications, additions, and redundancy in existing control systems easy without incurring capital expenditure (CapEx).
Modern PLCs are programmed using the proprietary OEM software that comes with the system. This software incorporates graphical programming interfaces, such as ladder logic, that enable automation engineers with limited programming knowledge to program the PLCs that automate the connected hardware. In a factory setting, combinations of PLCs are supervised by SCADA (Supervisory Control and Data Acquisition) systems, which are also programmed using OEM software provided by the system manufacturers, creating an ecosystem of Operational Technology (OT) software.
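As a taste of what ladder logic expresses, here is the classic start/stop seal-in rung, translated into Python. The ladder diagram in the comment and the signal names are a textbook example, not taken from any particular vendor's software.

```python
# The classic start/stop "seal-in" rung of ladder logic, in Python.
# In ladder notation (contacts in series = AND, in parallel = OR):
#
#   |--[ START ]--+--[/ STOP ]----( MOTOR )--|
#   |--[ MOTOR ]--+
#
# START is a momentary button; the MOTOR contact in parallel "seals in"
# the rung so the motor keeps running after START is released; the
# normally-closed STOP contact breaks the rung.

def scan(start: bool, stop: bool, motor: bool) -> bool:
    """One PLC scan cycle: compute the new state of the MOTOR coil."""
    return (start or motor) and not stop
```

A PLC evaluates every rung like this on each scan cycle, typically every few milliseconds, so the program behaves like continuously energized relay logic.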
The biggest jaw-drop comes when we analyze the security of this software, which is developed for engineers by OT software developers. Some researchers have dismissed it as Insecure by Design, especially with regard to access privileges and protocol vulnerabilities. Significant numbers of vulnerabilities have been reported against leading OEM software vendors, calling into question the very competency of the hardware giants to develop secure OT software. It was these vulnerabilities, combined with OS vulnerabilities, that Stuxnet exploited to inflict massive damage on selected critical infrastructure.
Stuxnet was allegedly developed by a joint US-Israeli effort to damage Iran's nuclear infrastructure, especially the Natanz nuclear facility. The objective was to delay and hamper Iran's ability to develop nuclear material that could be used for military applications. Although neither the US nor the Israeli defense establishment has admitted to creating Stuxnet, Israel's proficiency in cybersecurity and its national security objectives make a compelling argument for their involvement.
Stuxnet was developed as malware that attacked only SCADA systems made by Siemens, the German industrial devices giant. It was designed to exploit vulnerabilities in the Microsoft Windows operating system and in Siemens software, namely SIMATIC STEP 7 and SIMATIC WinCC. On the Windows side, the creators exploited four zero-day vulnerabilities to spread. The main objective of Stuxnet was to manipulate the speed of the Iranian nuclear centrifuges at Natanz, driving them beyond safe operating conditions until they failed, thus damaging the enrichment infrastructure.
It is important to note that most operational technology systems of modern critical infrastructure are built with direct cyber attacks in mind, and are therefore air-gapped in most cases. This means the local networks of SCADA systems are not connected to unsecured networks such as the Internet. This makes a direct remote cyber attack impossible without the involvement of a physical agent, thus reducing the vulnerability of the system. The developers of Stuxnet took this into consideration.
Stuxnet mainly had three components that worked in sync: a worm to deliver the payload, a link (.lnk) file to replicate the worm, and a rootkit to hide all the malicious code. The malware famously exploited the Windows shortcut vulnerability, spreading to removable devices such as flash drives; I have personally fallen victim to this class of attack.
What makes Stuxnet's sophisticated design most interesting to study is how it reached and affected the Natanz centrifuges. A rough outline of what happened is as follows:
- Stuxnet spreads to millions of devices through the internet, infecting computers and copying itself to removable devices such as USB flash drives.
- The malware infects the computer of a maintenance engineer through an infected USB flash drive. Since an air gap blocked direct attacks from external networks on Natanz's internal network, this was the only way such an infection was possible.
- The malware executes on the local host without any visible indication and replicates rapidly across the local network by exploiting a Windows networking vulnerability.
- The malware finds the control computer running Siemens software and infects its configuration files. Reports vary on whether this software was SIMATIC STEP 7, the Siemens PLC programming software, or SIMATIC WinCC, the Siemens SCADA software. The infection results in malicious code being executed by the system.
- The code alters the PLC program to change the rotational speed of the Natanz centrifuges, thereby controlling the hardware. The malicious routine is said to have run only once every 27 days to remain undetectable.
- The code also falsifies the system's output to hide the changed speeds. For example, if the speed is increased from 10,000 rpm to 15,000 rpm over a period of three months, the SCADA system still displays 10,000 rpm as the current speed. This delays the date of discovery and thereby increases the damage to the infrastructure.
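The readback-falsification trick in the last step can be illustrated in a few lines. To be clear, this is a conceptual sketch of the idea of replaying a stale baseline to the operator display; it is not how Stuxnet was actually implemented, and every name and number here is made up.

```python
# Conceptual sketch only: the true process value drifts while the
# operator display keeps showing a previously recorded baseline.
# This illustrates the man-in-the-middle reporting concept, not
# Stuxnet's actual mechanism.

class SpoofedDisplay:
    def __init__(self, baseline_rpm: int):
        self.baseline_rpm = baseline_rpm  # value captured before tampering

    def operator_view(self, true_rpm: int) -> int:
        """The HMI shows the stale baseline, not the real speed."""
        return self.baseline_rpm

display = SpoofedDisplay(baseline_rpm=10_000)
true_speed = 15_000                       # real speed after tampering
shown = display.operator_view(true_speed) # operator still sees 10,000
```

The defensive lesson is the inverse of the trick: a display that merely echoes values from the same compromised path proves nothing, which is why independent out-of-band instrumentation matters.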
The complexity of Stuxnet resulted in it being named the world's first digital weapon.
However well designed Stuxnet was, its payload is essentially a logic bomb: malware that executes only when a logical condition is met, in this case a control computer for a Siemens S7-400 PLC running SIMATIC WinCC and SIMATIC STEP 7 software. This was the configuration at the Natanz centrifuges, but not only there. The malware ultimately spread to 115 countries, affecting thousands of pieces of industrial equipment running machines with that configuration.
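A logic bomb's defining feature is that trigger condition, which can be sketched in miniature. The configuration in the check mirrors the one described above; everything else is illustrative pseudologic, not Stuxnet's actual code.

```python
# A logic bomb in miniature: the payload stays dormant unless the host
# matches a specific targeted configuration. Illustrative sketch only.

TARGET = {
    "plc": "S7-400",
    "software": {"SIMATIC STEP 7", "SIMATIC WinCC"},
}

def should_trigger(plc_model: str, installed: set) -> bool:
    """Fire only when the PLC model and installed software both match."""
    return plc_model == TARGET["plc"] and TARGET["software"] <= installed

should_trigger("S7-400", {"SIMATIC STEP 7", "SIMATIC WinCC"})  # True
should_trigger("S7-300", {"SIMATIC STEP 7"})                   # False
```

This is also why Stuxnet could spread so widely while damaging so selectively: on the vast majority of infected machines the condition never evaluated to true.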
Security of Industrial Systems
The United States Department of Homeland Security, the agency created post-9/11 to coordinate the national security apparatus, identifies 16 different sectors as Critical Infrastructure Sectors.
It is quite important to note that the top tier of critical infrastructure needs to be fail-safe and fail-secure, as any damage to the infrastructure can result in a direct loss of human life, even ignoring other negative externalities such as damage to the environment and wildlife.
It is also important to note that almost all of the above sectors use industrial systems at different levels. These may be the direct operations of the sector in the top and middle tiers, or utility operations (air conditioning, water, power) in the middle and bottom tiers. These systems are mostly designed by reputed manufacturers such as Rockwell Automation, Siemens, and Schneider Electric, except for some specialized operations in certain industries or limited vendors in defense establishments; the same type of vendors targeted by Stuxnet.
Most industrial systems run on Windows, and OEM software vendors tend not to provide even Linux-based versions of their software. This is problematic, as Windows systems are more vulnerable, partly due to the closed-source nature of the OS. This is evident from vulnerability assessments of modern operating systems: Windows 10 has 712 total reported vulnerabilities with 134 critical, while the Linux kernel has 604 total with none critical (scores of 9 to 10 are considered critical vulnerabilities). These figures of course ignore any zero-day vulnerabilities.
IoT and the Nightmare of Things
One of the main mistakes made by IoT startups is a lack of diversity in their R&D teams. The startups are often composed purely of software engineers and electrical engineers, with no regard for safety engineering or systems engineering, two key general engineering disciplines that should be considered when connecting industrial systems. I have seen two types of people setting up industrial IoT systems who should not be:
- Hardware engineers with no understanding of cybersecurity concepts, who write flawed IoT software that can be easily exploited.
- Software engineers with no understanding of hardware, who try to connect software to hardware directly.
The internet is full of articles about connecting PLCs to embedded devices like Arduinos and Raspberry Pis, and even of cases where Raspberry Pi modules are used as substitutes for PLC systems to reduce cost and improve programmability. Most of these projects never reach industrial markets and remain DIY inventions; but how many factory engineers use these unsecured devices for industrial control? We do not know. This is one of the reasons why secured IoT operating systems like Ubuntu Core, which limit access to GPIO ports and network ports, should be developed.
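Part of what makes these hobbyist setups dangerous is that many field protocols they speak have no concept of identity. Modbus is the classic example: any device that can reach the controller on the network may read or write registers. The toy server below is an illustration of that property, not a real protocol implementation; all names and values are made up.

```python
# Sketch of why unauthenticated OT protocols are Insecure by Design.
# Modbus-style register access has no authentication, authorization,
# or audit trail: possession of network access equals full control.
# Toy illustration only, not a real Modbus implementation.

class ToyRegisterServer:
    def __init__(self):
        self.registers = {0: 1500}   # e.g. a pump speed setpoint

    def write(self, address: int, value: int) -> None:
        # Note what is missing: no credentials, no permission check,
        # no logging of who changed what.
        self.registers[address] = value

server = ToyRegisterServer()
server.write(0, 9999)   # any host on the LAN can issue this write
# server.registers[0] is now 9999: the setpoint changed silently
```

On a properly segmented plant network this risk is contained by isolation; bridge that network to a hobby board with Wi-Fi and the protocol's missing security becomes everyone's problem.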
Going beyond industrial applications, IoT systems such as smart cities, smart transportation, and smart grids need to be Secure by Design, so that any vulnerability in hardware, firmware, or software is difficult for hackers with malicious intent to exploit.
I'm glad the LTTE did not operate in an age when Sri Lanka was fully digitized. Disruption by exploiting cyber vulnerabilities would have been easy for such a terrorist organization, and would have made my parents' biggest fears seem tame in comparison. Interestingly enough, apart from a few bad actors and state actors, we have not seen any cyber-terrorist organization commit attacks on the level of Stuxnet.
What can we do to avoid such a future? I have a few ideas that I think any organization developing critical infrastructure software should try.
- Secure by Design
Consider the security of the system at the design stage itself, rather than basing your decisions only on performance and reliability. One approach that has been working is risk-based software design for critical infrastructure, in which every design decision you make is scrutinized for cybersecurity risk.
- Diverse developer teams
Safety engineering and systems engineering did not fall to Earth from space; they were developed in response to requirements faced by traditional industries. These disciplines, some of which have been maturing for over a hundred years, have developed practices that software companies can adopt. So the next time you are building software to control flow in an oil pipeline, hire a pipeline engineer.
- Delayed adoption of new technologies
Some technologies are not secure from the outset, and this applies even to technologies developed for security, like blockchain. You can roll back systems if the new technology you adopted turns out to have critical vulnerabilities, but how practical would that be in complex systems like these? Better to be late than sorry, in my opinion.
- Multi-system security
Modern systems should not be secured by only one system, but by multiple systems. If you are automating a hydropower plant, use multiple systems (electrical, mechanical, cyber, etc.) to control the release valve. Note that these systems work best in isolated loops; there is no point in using both a biometric scan and a PIN for your banking system if they use the same network for verification.
- Good old penetration testing
Sometimes it is just that: hire some very good penetration testers to exploit the vulnerabilities of your system before someone else does.
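To close, the multi-system idea from above can be sketched as two independently computed permissives that must both agree before the release valve opens. The functions, thresholds, and signal names here are illustrative assumptions, not a real plant design.

```python
# Sketch of multi-system security: the release valve opens only when
# two independent loops agree. Names and thresholds are illustrative.

def mechanical_interlock(pressure_bar: float) -> bool:
    """Independent mechanical loop: block release above a pressure limit."""
    return pressure_bar < 12.0

def control_system_permissive(operator_authorized: bool, level_m: float) -> bool:
    """Separate cyber/electrical loop: operator approval plus level check."""
    return operator_authorized and level_m > 3.0

def open_release_valve(pressure_bar: float,
                       operator_authorized: bool,
                       level_m: float) -> bool:
    # Both isolated systems must agree; compromising one is not enough.
    return (mechanical_interlock(pressure_bar)
            and control_system_permissive(operator_authorized, level_m))
```

The value of this design is that an attacker who fully controls the software permissive still cannot open the valve while the mechanical interlock, which shares no network or code with it, holds.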
Happy developing! :)