October 6, 2014

Protect your Azure Linux VM (aka: How to Avoid a $1500 Charge)

Back in July, I got a Pingdom alert that an insanely low-traffic site I was working on was down. After a quick investigation, it looked like the DB layer was down. (For context, the only people using this website were me and about 3 other people, and our DB had 3 records.) I tried to SSH into the machine and couldn’t. I logged into Azure, tried to restart the machine, and was unable to. Then I noticed the network graph: it was massive. I’m talking like 50Gb/s massive (this thing was getting at most 1 req/s from Pingdom before; according to my bill, I used 12+ terabytes of bandwidth). To play it safe, I deleted the VM and sent an email to Azure billing alerting them to the issue and asking whether I was going to be held liable. I’ve done a bit of data center colocation work, and our IP transit providers were normally pretty lenient about charges caused by hacked machines or attacks. At about the same time, I got an email from the Azure Safeguards Team saying:

The Microsoft Azure Safeguards Team has determined that a Denial of Service Attack is originating from your Azure deployment (VIP: 138.91.***.***, Name: law****-**). We have included the details of our investigation below.

Such behavior violates the Microsoft Azure Services Terms and your azure deployment has been suspended. The Microsoft Azure Acceptable Use Policy and other agreement terms can be found at http://azure.microsoft.com/en-us/support/legal/. Repeated violations of the Azure Acceptable Use Policy may result in termination of your AzureSubscription.

The Microsoft Azure Safeguards Team ensures that customers abide by the terms of use and investigates allegations of misuse. This allows Microsoft Azure to provide a safe and reputable service environment for all.
If you are surprised by this activity on your Microsoft Azure deployment, have additional questions, or believe that this suspension is not warranted, please reply to this email and let us know. You can also contact Microsoft Azure Customer Support at http://azure.microsoft.com/en-us/support/options/.

This explained why I couldn’t SSH in or restart the machine. I replied that my machine had been compromised and that I had deleted it from my deployment.

After about a day, I heard back from billing support:

It goes without saying that we would refund the amount that you incurred due to this event. But before we come to the conclusion that your VM was indeed compromised, I would like to know what led you into believing that this was due to an external factor (a hack. in this case). Although unlikely, do you think this could have been caused by a spike of one (or a bunch) of the processes running on the VM itself?

Awesome: since my machine was compromised, they were going to be nice and refund the costs; they just wanted to verify that it was compromised. Well, I had gotten an email from the Safeguards Team saying it had launched a DoS attack, and I don’t think I’m the kind of guy who would purposely launch a DoS attack (I’m a 4-year Microsoft MVP). I also told them that I hadn’t logged into the machine in about a month and that the website this DB served had no weird traffic patterns. The support rep then asked what OS I was using, and I replied “Ubuntu”. Microsoft said they’d investigate and get back to me. Here is their response:

It turns out that the Security Team cannot investigate on this, as the data transfer is happening on a Non-Windows Virtual Machine. Looking at the data transfer, it does looks like the Virtual Machine was compromised. I would suggest you to verify this from your end as our Security Team unfortunately cannot help.

So essentially I asked, “Well, now what? We agree that I didn’t do this intentionally, and the machine has been suspended and then deleted to prevent further damage.” Their disappointing response:

I have an update from the escalation team on this request. Now, because the compromise was caused due to a security or a vulnerability of a non-windows OS, Microsoft will not be responsible for this. I am afraid, a refund cannot be provided.

At this point, I got the ticket escalated and the response was:

Good Morning. As you are aware we have been in touch with our Business Management team to check whether we have the approval to process a refund for the charges. Kindly be informed that we do not have the approval to give refunds for usage that comes as a result of the VM being left in an insecure state and then getting compromised because of the insecure state. We request you to follow good security practices and keep the VM secure at all times.

Okay, so I left the machine in an insecure state. Sure, I could have (should have) set up fail2ban, or disabled password login and allowed only SSH keys, but I was using the default configuration. I went to the Azure “create a VM” page, selected an Ubuntu image, set a random password (it looked something like this: Ac12sd5s.5), and that was it. So simply creating an Ubuntu VM, choosing a decent password, and using all the default settings will leave your machine in an insecure state, and leave you liable for $1500 worth of charges before they suspend your account.
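If you want a quick way to tell whether a box still accepts password logins, here is a minimal Python sketch that reads the text of an sshd_config file and reports whether password authentication looks enabled. PasswordAuthentication is a real OpenSSH option and defaults to "yes" when unset; the function name password_login_enabled and the sample config are mine, just for illustration. It is deliberately simplified: it stops at the first Match block and does not follow Include directives, so treat the result as a hint, not a full audit.

```python
def password_login_enabled(config_text: str) -> bool:
    """Best-effort check: does this sshd_config text leave password login on?

    Simplifications (assumptions of this sketch): settings inside Match
    blocks are ignored and Include directives are not followed.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # sshd_config allows "Keyword value" or "Keyword=value"
        tokens = line.replace("=", " ", 1).split()
        keyword = tokens[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # everything after this is conditional; stop here
        if keyword == "passwordauthentication" and len(tokens) > 1:
            # sshd uses the first value it sees for a keyword
            return tokens[1].strip('"').lower() != "no"
    return True  # OpenSSH default when the option is not set


# Hypothetical config text, not taken from a real machine
SAMPLE_CONFIG = """\
# Example settings
PermitRootLogin no
PasswordAuthentication no
"""

if __name__ == "__main__":
    print("password login enabled:", password_login_enabled(SAMPLE_CONFIG))
```

On a real machine you would feed it the contents of /etc/ssh/sshd_config (for example via open("/etc/ssh/sshd_config").read()) and, if it returns True, set PasswordAuthentication to no once your SSH key login is confirmed working.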

Once the charge came through to American Express, I disputed it. Microsoft never responded to the dispute, so that was it.

tl;dr: As soon as you set up your Linux-based Azure box, install fail2ban, set up SSH keys, and set a cost threshold in Azure billing.
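To get a feel for why the cost threshold matters, here is a rough back-of-the-envelope sketch in Python using only the numbers from this post: roughly $1500 billed for roughly 12 terabytes, and a network graph showing about 50Gb/s. The per-GB rate it computes is simply implied by those two figures; it is not Azure's published price. The $50 threshold and the function names are hypothetical, and I'm assuming 1 TB = 1000 GB.

```python
def implied_rate_per_gb(total_charge_usd: float, transferred_tb: float) -> float:
    """Dollars per GB implied by a bill, assuming 1 TB = 1000 GB."""
    return total_charge_usd / (transferred_tb * 1000)


def gb_until_threshold(threshold_usd: float, rate_per_gb: float) -> float:
    """How many GB of outbound traffic it takes to hit a spending threshold."""
    return threshold_usd / rate_per_gb


def seconds_to_transfer(gigabytes: float, link_gbit_per_s: float) -> float:
    """Time to push that many gigabytes at a given rate in gigabits per second."""
    gigabytes_per_second = link_gbit_per_s / 8  # 8 bits per byte
    return gigabytes / gigabytes_per_second


if __name__ == "__main__":
    rate = implied_rate_per_gb(1500, 12)      # figures from this post
    budget_gb = gb_until_threshold(50, rate)  # hypothetical $50 threshold
    print(f"implied rate: ${rate:.3f}/GB")
    print(f"GB before a $50 threshold trips: {budget_gb:.0f}")
    print(f"seconds at 50Gb/s: {seconds_to_transfer(budget_gb, 50):.0f}")
```

The point is that at attack-level throughput even a small threshold gets crossed very quickly, so a billing threshold is a backstop on top of locking down SSH, not a replacement for it.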