OT: Potentially largest IT outage in history today

Trojanbulldog19

Well-known member
Aug 25, 2014
8,868
4,348
113
Jurassic Park Hold Onto Your Butts GIF

Pretty much every IT department today.
 

The Cooterpoot

Well-known member
Sep 29, 2022
4,166
6,761
113
Was it as bad as the Scott's Tissue ad that sucks the life out of this site when it shows up?
 

QuadrupleOption

Well-known member
Aug 21, 2012
1,012
804
93
Expect to see more of these types of events in the future. With there being a push to cloud for all SaaS apps and cloud hosting, we are an interconnected global economy now. Everything relates to everything else, and one failure sets off a domino sequence.

The only way to combat these types of failures is to have better planning, tested backup plans, etc. While not realistic, the best way to plan is to have a fleet of offline devices at all times, have a copy of all data offline (air-gapped), and have multiple vendors for different products. Have AT&T and Verizon, have Mac and Windows OS. Have multiple EDR solutions, etc.
QA is bad in many cases, but most (if not all) of the big-boy SaaS operations run containerized systems that are easily spun up/down to handle extreme demand, are geo-redundant, and can be rolled back quickly if a software update goes wrong.

This issue was due to poor QA pushing out a patch that disabled individual laptops. I assume most IT shops had their servers up and running quickly after this occurrence.
 

MSUGUY

Member
Oct 11, 2020
346
199
43
Are the effects of the Crowdstrike problem limited now, or is this the beginning of something more serious?

My job is canceled tomorrow because my clients are stuck out of the country with no flights. I have friends who can’t return from the Caribbean because there are no flights.
 

Willow Grove Dawg

Well-known member
Nov 3, 2016
5,750
1,461
113
I flew JAN to ATL Friday and returned Saturday on Delta.
Friday's flight, scheduled for 6:07 AM, departed Jackson at 9:30 AM.
Saturday's return was scheduled for 1:15 PM and departed about 4:00 PM.
I was lucky because I did not have any connections. I thought Delta managed the situation as well as possible given the circumstances, because they had very little information available to them, especially Friday morning.
Atlanta Hartsfield was a complete disaster both days, though. I could not imagine the number of people in the airport. There weren't any hotel rooms or rental cars available, so there were literally thousands of people sleeping in the airport with flights delayed by days if not cancelled. The walkways between the terminals looked like homeless camps.
 

Boom Boom

Well-known member
Sep 29, 2022
1,942
1,091
113
Expect to see more of these types of events in the future. With there being a push to cloud for all SaaS apps and cloud hosting, we are an interconnected global economy now. Everything relates to everything else, and one failure sets off a domino sequence.

The only way to combat these types of failures is to have better planning, tested backup plans, etc. While not realistic, the best way to plan is to have a fleet of offline devices at all times, have a copy of all data offline (air-gapped), and have multiple vendors for different products. Have AT&T and Verizon, have Mac and Windows OS. Have multiple EDR solutions, etc.

I will admit this is a new one that no one has seen before. The main issue with this event was that it required boots on the ground for physical endpoints. This wasn't a situation isolated to a single organization, like a typical ransomware event where you could bring in an IR firm as reinforcements.

You can rest assured that our adversaries (China, Russia, Iran, and North Korea) have taken note. The best way to have the biggest impact is to infiltrate the "supply chain". An example of this was when SolarWinds was compromised via updates a few years back. You get hired as a developer and gain trust in the software development process, you get the access you need and learn the ropes of the approval processes. You learn the culture and determine the checks and balances, then you slip in a little code over time and have it deployed.

While this wasn't a compromise, it was similar in that a single piece of software used globally by organizations was impacted.

Imagine having the ability to remotely "kill switch" all devices (Windows, Nest, iPhone, etc.)

One day this will occur, and when it does all hell will break loose.
The problem is ample QA hurts margins, so corp America hates it. Maybe they're not as bad about it as manufacturing in America has gotten. Yet.

That's more info on SolarWinds than I've ever seen publicly reported. It's like the media isn't allowed to talk about it....
 

00Dawg

Active member
Nov 10, 2009
3,043
272
63
QA is bad in many cases, but most (if not all) of the big-boy SaaS operations run containerized systems that are easily spun up/down to handle extreme demand, are geo-redundant, and can be rolled back quickly if a software update goes wrong.

This issue was due to poor QA pushing out a patch that disabled individual laptops. I assume most IT shops had their servers up and running quickly after this occurrence.
It disabled any computer using Crowdstrike that was powered by any of several editions of Microsoft OSs, including all the major PC and server editions still in use, at least back through Server 2008.
Took us about 12 hours to get things 100% running again, although we were never 100% down because not all of our servers' Crowdstrike installs had updated by the time Crowdstrike pulled the update down. I still have at least one team member whose laptop will have to be reimaged or replaced, and this is a guy with two decades of IT experience.

Meanwhile, someone in computer forensics at UAB went on the local news here and said Crowdstrike fixed the issue by sending out another update. That was definitely incorrect. Any computer that got the update required some kind of intervention to run again, be that rolling back to an earlier backup or manually deleting the file causing the issue; impacted computers couldn't get online to receive another update.
 

MSUGUY

Member
Oct 11, 2020
346
199
43
The problem is ample QA hurts margins, so corp America hates it. Maybe they're not as bad about it as manufacturing in America has gotten. Yet.

That's more info on SolarWinds than I've ever seen publicly reported. It's like the media isn't allowed to talk about it....
I think this is what happened to MGM last year. They refused to pay for IT security upgrades and eventually got hit with a ransomware attack, which they opted not to pay. They shut the whole system down and started from scratch.
 

onewoof

Well-known member
Mar 4, 2008
9,704
5,832
113
The fact that most of the world runs on Microsoft Windows... says a lot
 

Anon1704414204

Well-known member
Jan 4, 2024
880
727
93
I find it amazing that for some 10K years the horse was the fastest mode of transportation till some 175 yrs ago the steam locomotive showed up. Since then we've gone to the moon and now AI when only a little while back the Pony Express was cutting edge. Makes me think of Elton John's song "Country Comforts" ...."Down at the well they got a new machine... Foreman says it cuts manpower by 15...oh but that ain't natural Old Clay would say cuz he's a horse drawn man until his dying day."

 