Code of Cthulhu

by David Cameron Staples

A roleplaying game of eldritch investigations into secrets which man was not meant to know. Where Call of Cthulhu meets Kult meets System Administration.

GM You are happily sitting at your desk after a pleasant Friday lunch, when you see a notification of an email.

Player I groan and open the mail.

It is a Nagios ticket. CPU warning on a server. You don't recognise what the server actually does.

Sigh. I log in and run "top".

It's pretty slow to log you on. Top line of top shows "java" using 249% of the CPU. Load's at 93. Roll 1d10 SAN loss.

Shit. Um... press "c".

You can see the full details of the Java invocation. (d6 SAN) And you can see the fateful curse "tomcat" mixed in with the strange moon language. Roll another d10 SAN loss.

Dammit. Um, do I have any service owner contact details?

What do you think?

I think I should ignore it and see if it'll sort itself out.

You go and get a cup of coffee. It's good. Recover 3 SAN.

Suddenly you realise there's a Client Relationship Manager standing right behind your chair, breathing heavily. You didn't even hear him approach. d6 SAN loss. He tells you that some critical service is unresponsive, and he's already escalated to your supervisor's boss that you haven't fixed it yet.

Is he the service owner?

No, he's the guy who promised the customer that this service would never fall over. He also golfs with the head of the IT services department.

Shit. I ask him if I have his permission to restart the service.

"What, you want to break it more? Why do we pay you people to break things? You have to fix it now, there are millions of dollars at stake!"

Does he know who the service owner is?

Guess.

Does he know where there's any documentation?

Seriously?

Right. Um. Fuck it. "# service tomcat restart"

tomcat: unrecognised service

Fuck. "# chkconfig --list"

You see only one service which looks like it might be what you're looking for. It's called "data_sqafxz".

"# service data_sqafxz restart"

"Usage: /etc/init.d/data_sqafxz {start|stop|import}"

"Import" WTF? No restart function. OK: "# service data_sqafxz stop"

"Stopping SQaFXZ data service ... ... [OK]"

Right. "# service data_sqafxz start"

"Starting SQaFXZ data service ... ... ... ... ... ... ... ... [FAILED]"
The CRM behind you has started shouting. "What did you do? Did you just break it? Why did you break it?" His phone starts ringing. Another CRM arrives and starts yelling at the first one, then they both yell at you. Your INT is effectively -20 while they're doing this.

Fuck. Logs. There have to be logs, right?
"cd /var/log/; ls -l"

There is a SQaFXZ directory.

"cd SQaFXZ; ls -l"

It appears this package does its own log rotation. Meaning in practice that there is a date-stamped log file for each day this service has been operating, and none have been deleted.

What? Why??

One of the CRMs shouts something about "auditing" and "security". The other screams "Access Control!".

How far back do these logs go?

ls -l pipe through wc -l... divide by 365... about four and a half years.

How big is this disk?

df -h says "... 100GB  57%  /var/log/SQaFXZ". Go ahead and roll another d6 SAN.

That's ... what, 30+ megs of log a day, every day for four and a half years?

Yep.

tail(1) today's.

Ten lines isn't nearly enough. You can see that it's all Java error and warning logs, and you can see that something dropped its clogs, but you will need to go further back to see what.
Another CRM has turned up, and he brought your supervisor's boss. They are all yelling at each other and at you. Lose another 10 INT while they're doing it, and have a d10 SAN hit.

less(1) today's log.

Start making INT checks.

Fail.

You see a reference to a failure related to a service not being accessible, but you don't think that's the immediate problem, because that service was turned off three years ago and the hardware taken away by a metaphysical hazmat team.

Hey, pass!

You find a reference to a server not responding, and lots of errors after that where it went crazy trying to reconnect. That's probably where the CPU load went.

I'd better have a look at that other server, then.

^A^C on screen(1), and you're good to go. This server is up, but also taking its time. It's running Solaris.

Yeah, I'll just go ahead and roll ... d6?--

d6

-- d6 SAN loss now. OK, what's making it slow? I run top.

At the top of the screen you see the command "oracle", and a series of dread runes and forbidden incantations taking up the rest of the commands on the page. Take 10d10 SAN loss. ... Oh dear, that takes you negative. Very negative. So negative that you suffer a contagious psychotic break: your madness cuts holes in reality itself, and faceless daemons from unknown realms emerge to plague a yet unknowing mankind. The CRMs greet one as "Doug". You weep tears of blood and gibber quietly as the veil shreds and you realise that you sit in the cubicle by the toilet door in the fluorescent cube hell of Gehenna.

It is now five forty-seven, Friday afternoon.