It’s Halloween, a night to hug your servers tight and fear the hardware that goes bump in the night. This year, our team wanted to explore the dark chapters of IT professional careers to uncover some of the real-life nightmares they’ve had to endure. We’ve put together five of these mysterious and horrific stories that will send chills down your spine and make you want to run screaming from the data center.
The Phantom Room
Years ago, an IT team had to open up on a new floor and put in a whole new data center. Based on the building plans, nothing was in place, so they were looking forward to working with a blank slate. When they got there, however, the rooms themselves were locked … and they were not empty. Instead, inside of each room, there was a whole set of other walls.
When the team broke through these walls, they found that all the rooms were full of computer equipment that was top-of-the-line when new, but was now quite old. Members of the IT team who had the skills for implementing and managing this equipment had retired years ago, and the equipment itself had been running for years. In addition, all the wiring was overhead, which was why it had not appeared on the building plans. It was as if the room had never existed. While this did break some building regulations, the team had two things to be thankful for:
- They hadn’t damaged any of the wiring when they had to break into the rooms.
- The equipment itself was supporting several of the company’s key production applications, so it was a relief that it had not had any major failures or maintenance issues!
A Rat’s Nest of Cables
A CIO attended a demonstration of web server appliances that were new at the time. He was attracted by the blue lights, but he realized that the move to centralize thousands of servers would be a big migration project. More than 3,000 web servers would be affected, along with all the attendant networking.
When we started clearing the data center out, we had to remove five separate kinds of network equipment: Ethernet, Token Ring, ISDN, HIPPI, and SPI. What made this even worse was that generations of animals had made their homes in that equipment—so we were pulling out dead animals all the way through. Whenever someone complains to me about a rat’s nest of cables, I always think of that project.
On top of all the networking changes, we also found that we couldn’t fit the uninterruptible power supplies (UPSs) into each rack to support the web server appliances.
You’re Never Seeing Your Data Again!
A big migration project was taking place when the company moved to a larger storage array.
This would not have been a problem except for the fact that the encrypted data was moved over to the new array at the same time as the encryption key. Both of these were on the same array, at the same time … and the new array promptly failed. As a result, we couldn’t get the key back, and the data was completely inaccessible, never to be seen again! A big lesson learned!
The Patching Trail to Hell
One of the organizations that I used to work at had over 400 instances of Microsoft SQL databases across its enterprise. These were geographically dispersed across the United States and run by different departments across the organization.
At one point, an SQL injection attack was launched by malware writers. It was based on a known vulnerability that Microsoft had released a patch for; however, about three-quarters of the database instances that were in place in the organization were not patched. As a result, the organization experienced about a 90% infection rate. You can do the math: three-quarters of over 400 means more than 300 machines, and 90 percent of that equates to over 270 stricken servers! It was a scary problem.
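For anyone who wants to check that math, here is a minimal sketch of the estimate. It assumes a round 400 instances (the story says "over 400", so these are lower bounds), and the function name `estimate_infected` is purely illustrative.

```python
def estimate_infected(total_instances, unpatched_fraction, infection_rate):
    """Return (unpatched, infected) counts, rounded to whole servers."""
    # Instances that never received the vendor patch.
    unpatched = total_instances * unpatched_fraction
    # Unpatched instances that were actually hit by the attack.
    infected = unpatched * infection_rate
    return round(unpatched), round(infected)


# Assumed inputs: 400 instances, three-quarters unpatched, 90% infection rate.
unpatched, infected = estimate_infected(400, 0.75, 0.90)
print(f"Unpatched: {unpatched}, infected: {infected}")
```

Because the real estate was larger than 400 instances, the true counts would have been somewhat higher than what this prints.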
Why was this allowed to happen? It wasn’t that patching couldn’t take place. Most of the teams were trying their best to keep up to date or put patching windows in place, and there was a set of rules in place for backup and disaster recovery. However, some of those MS SQL databases were used by various backup solutions across the organization as media and catalog servers. When those servers were infected too, not only were the production databases unrecoverable, but the backups themselves could not be restored either. The attack had admins across the organization scrambling to see if there were local backups or database snapshots available to plug those gaps and get back up and running. A database nightmare for sure!
We were working on a business continuity strategy, and in addition to the traditional IT failure scenarios that everyone has to deal with, our geographic location makes us subject to hurricanes, flooding, and tornadoes (and we’re not ruling out a zombie apocalypse either). This is something that we have to plan for because these events have happened already in the past, and we know they will happen again (except the zombies, still waiting for that one).
When I looked at upgrading our backup systems, I could not wrap my head around trying to recover an on-premises backup system, mission-critical applications, and end-user data, all at the same time and in the middle of a natural disaster. It was just too scary for us to continue down that path. Instead, we decided to start using the cloud and shifted over to a “data management as a service” (DMaaS) approach. This was a no-brainer for me because it meant we could sleep through the night and avoid having to fix problems in the middle of a hurricane (or in case of zombies).
Be afraid … be very afraid