Worldwide Tech Outage Started with Defective Crowdstrike Update to Microsoft Windows
An issue with a commonly used security software called Crowdstrike shuttered large technology systems around the globe, including airlines, transit systems and stock exchanges
The following essay is reprinted with permission from The Conversation, an online publication covering the latest research.
A major IT outage has hit businesses across the world, grounding planes as well as affecting banks and the healthcare sector.
George Kurtz, CEO of IT security firm Crowdstrike, said it had traced the issue to a “defect found in a single content update” for the security software it provides for the Microsoft Windows operating system on computers.
Microsoft said the issue was caused by an “update from a third-party software platform” and that the “underlying cause” had now been fixed.
The Conversation spoke to Professor Alan Woodward, an expert in cybersecurity at the University of Surrey, about what went wrong and how the problem could be resolved.
Can you explain what’s happened here?
I think there are two things. First, Microsoft seems to have had a problem with its Azure cloud computing platform. It’s a bit unclear, but there was a degree of degradation in that service starting in the evening of 18 July. However, it didn’t fail altogether.
But by far the bigger problem seems to be an update that appears to have been done in the late evening of 18 July for [IT security company] Crowdstrike’s Falcon product, a computer threat checker. Falcon works by having some “agent” software deeply embedded in the operating system of every PC, which monitors that computer and “calls home” if there’s a problem. It also receives updates on what to look out for if there’s a threat. It’s used a lot by large organisations throughout the world, which have a huge number of PCs to police.
I’m sure Crowdstrike are urgently investigating what happened. This piece of software is designed to protect people from ransomware attacks and the like. From the latest information I’ve seen, it looks like the update system file was somehow released in an incorrect format.
The Windows operating system gets to this update and it doesn’t know how to cope, so it crashes. That’s why people have been getting the “blue screen of death” [a computer screen with an error message indicating a system crash].
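To make the point about a file “released in an incorrect format” concrete, here is a minimal Python sketch of a loader that checks an update before using it and keeps its current rules if the check fails. It is an illustration only: the file layout, the `UPD1` marker and every function name are invented for this example and do not describe how Falcon or Windows actually handle updates.

```python
import hashlib
import struct

# Hypothetical update layout (an assumption for this sketch, not a vendor format):
# 4-byte marker, 4-byte payload length, 32-byte SHA-256 of the payload, then the payload.
MAGIC = b"UPD1"
HEADER = struct.Struct("<4sI32s")


def build_update(payload: bytes) -> bytes:
    """Package a payload in the hypothetical update format."""
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(MAGIC, len(payload), digest) + payload


def validate_update(blob: bytes) -> bytes:
    """Return the payload if the blob is well formed, otherwise raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file is shorter than the header")
    magic, length, digest = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("unexpected format marker")
    payload = blob[HEADER.size:]
    if len(payload) != length:
        raise ValueError("payload length does not match the header")
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch")
    return payload


def load_update(blob: bytes, current_rules: bytes) -> bytes:
    """Use the update only if it validates; otherwise keep the current rules."""
    try:
        return validate_update(blob)
    except ValueError:
        # A malformed file is rejected instead of being processed.
        return current_rules
```

For example, `load_update(build_update(b"rules-v2"), b"rules-v1")` returns the new rules, while a file of zero bytes leaves `b"rules-v1"` in place. The design choice being illustrated is simply that a bad file results in a refusal to load, not a crash.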
And the big problem is, you can’t fix this issue remotely. You have to go into every machine separately and put it into “safe” or “recovery” mode to isolate the software. From there, you should be able to reboot the machine and get it up and running again. But if you’re a big global company with a large distributed IT estate, that’s going to take a long time.
Why has this outage had such wide-ranging effects?
Crowdstrike has been a great success: its security software is used by hundreds of thousands of major clients around the world. So airlines, airports, railways, hospitals, stock exchanges … they’re all going down.
It started in Australia when they got up for business on Friday. The update had clearly been sent out last night UK time, and it has just rippled around the world.
With deliberate ransomware attacks, they’ll typically take out one or two targets at a time. But in this case, it’s happened to thousands of organisations at once. We’ve not had anything like this before.
How Crowdstrike will fix the software is yet to be determined. As I’ve explained, it’s clear how companies can work around the issue. But for some very large organisations, this could affect their critical infrastructure and business for a long time yet: it’s going to take them days to physically work round all those machines.
Can security companies ensure this doesn’t happen again?
Security software is very intertwined with a computer’s operating system; it’s buried deep in there. There has to be a way that if something is found to be corrupted, it doesn’t just keep crashing the system. This may have to be done in cooperation with Microsoft, which owns the Windows operating system.
There’s got to be some way of backing out of it, and there is. However, most people trying to log into their blank PCs don’t know how to put their PCs into safe mode and revert to a previous state.
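One way to picture an automatic “backing out” is a start-up guard that counts failed starts and reverts to the last version known to work. The Python sketch below is a hedged illustration under stated assumptions: the `UpdateState` record, the threshold of two failed starts and the function names are all hypothetical, and nothing here describes an existing Crowdstrike or Microsoft mechanism.

```python
from dataclasses import dataclass

MAX_FAILED_STARTS = 2  # hypothetical threshold chosen for this sketch


@dataclass
class UpdateState:
    active: str             # version that will be loaded at start-up
    last_known_good: str    # version that last completed a healthy start
    failed_starts: int = 0  # consecutive starts that never reported healthy


def on_start(state: UpdateState) -> str:
    """Pick the version to load, rolling back after repeated failed starts."""
    untested = state.active != state.last_known_good
    if untested and state.failed_starts >= MAX_FAILED_STARTS:
        # The new version has failed too often: revert to the previous one.
        state.active = state.last_known_good
        state.failed_starts = 0
    else:
        # Assume this start fails until mark_healthy() says otherwise.
        state.failed_starts += 1
    return state.active


def mark_healthy(state: UpdateState) -> None:
    """Record that the active version completed a healthy start."""
    state.last_known_good = state.active
    state.failed_starts = 0
```

In this sketch, a machine that has just received version `v2` loads it on the first two starts; if neither start reaches `mark_healthy`, the third start loads `v1` again without anyone having to visit the machine. A real implementation would have to persist this state somewhere that survives a crash.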
At the moment, it looks like it’s one corrupted file that’s producing a global problem. Computers download updates all the time, so how Microsoft prevents that from happening with this update, I don’t know. It’s not immediately obvious. And the million dollar question is: how did this corrupted file get released in the first place?
How long before this problem is fully resolved?
It’s certainly going to take days, if not weeks. It’s like those hospitals in London that got attacked with ransomware. They’re still suffering; there’s a very long tail on these things.
And in this case, it’s not just a long tail but a very broad swathe of global organisations in transport, health and everywhere else. I don’t think we’ve seen anything like this before.
On X, formerly Twitter, George Kurtz, co-founder and CEO of Crowdstrike, commented: “The issue has been identified, isolated and a fix has been deployed. We refer customers to the support portal for the latest updates.”
This article was originally published on The Conversation. Read the original article.