Kiip recently completed a migration from EC2 to VPC. VPC exited beta and became generally available in all regions in August, and allows you to provision compute nodes within a virtual network in AWS. For anything more than simple websites, we believe migrating to VPC is something worth doing, or at the very least worth investigating.
Because Amazon VPC is a new service and requires a substantial amount of domain knowledge, this article will first cover a quick intro to the benefits and parts of building a VPC. Specific details about our VPC architecture here at Kiip, tooling we’ve built around it, and our migration process will be covered in future posts.
What’s Wrong with EC2?
Prior to VPC, like most other companies on AWS, all our nodes were in EC2. EC2 is great, but taking a step back it is a pretty strange production hosting environment for various reasons:
- All nodes are internet addressable. This doesn’t make much sense for nodes which have no reason to exist on the global internet. For example: a database node should not have any public internet hostname/IP.
- All nodes are on a shared network, and are addressable to each other. That means an EC2 node launched by a user “Bob” can access any EC2 node launched by a user “Fred.” Note that by default, the security groups disallow this, but it’s quite easy to undo this protection, especially when using custom security groups.
- No public vs. private interface. Even if you wanted to disable all traffic on the public hostname, you can’t: each EC2 instance has only one network interface, and public hostnames and Elastic IPs are routed onto the “private” network.
The above notes have caused serious security issues in practice. There was even a presentation on public mining of memcached instances on EC2, discovered due to poor firewall setups. Certainly, you can work around it, and most companies do, via aggressive iptables rules and AWS security groups. Unfortunately, it is still very easy to shoot yourself in the foot, as evidenced by that presentation, in which big companies such as Bit.ly, Gowalla, and PBS accidentally opened their caches to the public.
What’s Great about VPC?
VPC, unlike EC2, provides you with a virtual private network with control over things such as routing tables, DHCP option sets, and more. This provides numerous benefits over EC2.
First and foremost, VPC provides a dramatically stronger security baseline than EC2. Nodes launched within a VPC aren’t addressable via the global internet, by EC2, or by any other VPC. This doesn’t mean you can forget about security, but it provides a much saner starting point versus EC2. Additionally, it makes firewall rules much easier, since private nodes can simply say “allow any traffic from our private network.” Our time from launching a node to having a fully running web server has gone from 20 minutes down to around 5, solely from no longer having to wait for firewall changes to propagate.
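As a rough sketch of what “allow any traffic from our private network” looks like on a host firewall, a private node’s inbound rules can collapse to a single trust rule. The 10.0.0.0/16 block below is a hypothetical VPC CIDR, not our actual one:

```
# iptables-restore fragment (hypothetical addresses): default-deny inbound,
# but accept anything originating from the VPC's private address space
*filter
:INPUT DROP [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -s 10.0.0.0/16 -j ACCEPT
COMMIT
```

Compare that to EC2, where every internal service needs its own carefully scoped rule because the “private” network is shared with everyone else’s instances.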
DHCP option sets let you specify the domain name, DNS servers, NTP servers, etc. that new nodes will use when they’re launched within the VPC. This makes implementing custom DNS much easier; in EC2 you have to spin up a new node, modify its DNS configuration, then restart networking services to get the same effect. We run our own DNS server at Kiip for internal node resolution, and DHCP option sets make that painless (it just makes much more sense to type east-web-001 into your browser instead of a raw private IP address).
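For illustration, a DHCP option set is just a handful of key/value pairs handed to every node at launch. The domain name and server addresses below are hypothetical placeholders:

```
# Hypothetical DHCP option set for a VPC
domain-name         = internal.example.com
domain-name-servers = 10.0.0.5      # your internal DNS server
ntp-servers         = 10.0.0.6      # your internal NTP server
```

Once the option set is associated with the VPC, every newly launched node picks these up automatically, with no per-node configuration.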
And finally, VPC simply provides a much more realistic server environment. While VPC is a product unique to AWS and appears to “lock you in,” its model is much closer to what you would build if you ran your own dedicated hardware. Having that knowledge beforehand, and building up the real-world experience surrounding it, will be invaluable if you ever need to move to your own hardware.
Whirlwind Intro to VPC
First, a fair warning: VPC is far more complicated than EC2. You have to manage subnets, routing tables, internet gateways, NAT devices, network ACLs, etc.
A VPC is built up of various parts:
- CIDR block – A VPC has one large CIDR block that defines the address space of all nodes that are launched within the VPC. This CIDR block must be a private address range, as defined by RFC 1918.
- Subnets – Within each VPC, there are one or more subnets. A node is launched within a specific subnet. A subnet has its own CIDR block, which must be a subset of the VPC CIDR block.
- Routing tables – Each subnet is associated with a single routing table, which defines how traffic is routed within your network.
- Internet Gateway – Each VPC can have a single internet gateway associated with it. The internet gateway provides internet addressability via Elastic IPs and allows nodes in public subnets to talk to the global internet.
- NAT devices – NAT instances are what allow nodes in a private subnet to talk to the global internet. Note that because this is a NAT, the global internet cannot initiate connections to the nodes behind it.
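The CIDR constraints above can be checked mechanically. Here is a minimal sketch using Python’s standard ipaddress module; the specific address blocks are hypothetical, not our production layout:

```python
import ipaddress

# Hypothetical VPC CIDR and two candidate subnet CIDRs
vpc = ipaddress.ip_network("10.0.0.0/16")
public_subnet = ipaddress.ip_network("10.0.0.0/24")
private_subnet = ipaddress.ip_network("10.0.1.0/24")

# The VPC block must be a private (RFC 1918) range
assert vpc.is_private

# Each subnet's CIDR must be a subset of the VPC's CIDR
assert public_subnet.subnet_of(vpc)
assert private_subnet.subnet_of(vpc)

# Subnets must not overlap one another
assert not public_subnet.overlaps(private_subnet)
```

Running a check like this before provisioning catches the most common VPC setup mistake: carving out a subnet that falls outside (or collides with) the VPC’s address space.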
Using these building blocks, it is possible to create an interesting arrangement of networks and access control to those networks. Here at Kiip, we use a VPC with a public and private subnet.
Before deciding to dive into VPC, you should be aware of some of the difficulties in that move.
- SSH to private subnet – The nodes in the private subnet are not internet addressable, which means you can’t SSH into them without already being inside the VPC network. There are two options: SSH into a bastion node in the public subnet and hop from there to the private node, or set up a VPN so you can reach all the nodes by their private addresses. We decided to deploy a VPN for ease of use.
- Elastic IPs for public addressability – Even if you launch a node in a public subnet, it will not get a public IP or hostname the way an EC2 node does. The only way to make it publicly addressable is to assign an Elastic IP to it. Elastic IPs are free while attached, but you are limited to five by default, so plan accordingly.
- VPC-to-VPC communication is non-trivial – If you’re attempting to have one VPC communicate with nodes in another VPC, buckle up: it’s a bumpy ride. Why you would want to do such a thing, and how it can be done, will be covered in a future blog post, but be aware ahead of time that it is difficult.
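If you take the bastion route instead of a VPN, OpenSSH can tunnel the second hop transparently. A hypothetical ~/.ssh/config sketch (the hostnames and addresses are made up for illustration):

```
# ~/.ssh/config (hypothetical hosts): reach private-subnet nodes
# by hopping through a public-subnet bastion
Host bastion
    HostName 203.0.113.10          # Elastic IP of the public-subnet node

Host east-web-*
    ProxyCommand ssh -W %h:%p bastion
```

With this in place, `ssh east-web-001` opens a connection through the bastion automatically, so day-to-day access feels the same as with a VPN.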
Our VPC Architecture and Migration
In this article, we talked about the issues with EC2, the benefits of VPC, and some initial difficulties that VPC presents. Additionally, we provided a whirlwind intro to the domain knowledge required to build up a VPC.
In following posts, we’ll cover the architecture of our VPC infrastructure, how it plays a key part in our scalability and high availability, and the steps we took to migrate our EC2 infrastructure into VPC with no noticeable downtime.