Managing your known hosts file

Written by Greg Keller on September 10, 2015


When you connect to a web site, you want to be assured that it’s the correct one, and not a fake site pretending to be something it’s not. Imagine you tried to go to Amazon but someone managed to intercept your traffic. You could end up handing them your credit card and personal information, and you’d be in a bad spot.

The same is true for connecting to your remote servers using SSH. When you establish a connection from your client, you need to know you’re actually communicating with your server and not with someone spoofing your server and capturing your sensitive data. That kind of impersonation is known as a “man-in-the-middle” attack.

A man-in-the-middle attack is one where an attacker has managed to insert themselves into the communication chain between you and your server. This isn’t all that far-fetched: network traffic typically passes through several relay points before reaching its destination. If someone malicious is at one of those points listening for interesting traffic, they can pretend to be your remote server and steal your confidential information. They can even act as you on the remote server after you’ve established your identity to it!

So how do we prevent these attacks? We need a secure mechanism for identifying the other server, which sounds like a perfect job for public key authentication. If SSH has the remote server’s public key, it can establish the server’s identity by requesting that the server sign a chunk of data with the associated private key and then checking that signature against the public key. This is exactly analogous to the ceremony the remote SSH server subsequently uses to establish the user’s identity, as seen in this webinar.

Your SSH client keeps a list of these remote hosts’ public keys in a “known_hosts” file. By default each user has their own copy of this file, stored under their home directory at ~/.ssh/known_hosts. On every remote connection the client uses these keys to securely establish the identity of the remote host. If the key the host presents doesn’t match the stored one, you see this scary warning:

The RSA host key for test_machine has changed,
and the key for the corresponding IP address
is unknown. This could either mean that
DNS SPOOFING is happening or the IP address for the host
and its host key have changed at the same time.
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
Please contact your system administrator.
Add correct host key in /Users/topher/.ssh/known_hosts to get rid of this message.
Offending RSA key in /Users/topher/.ssh/known_hosts:4
RSA host key for test_machine has changed and you have requested strict checking.
Host key verification failed.

At least they make it very clear that something is amiss!
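To make the comparison concrete, here is a minimal Python sketch of the kind of lookup described above. It is an illustration, not how OpenSSH is implemented: the function names `parse_known_hosts` and `check_host_key` are hypothetical, and the sketch assumes plain entries of the form `hostnames key-type base64-key`, skipping hashed hostnames, wildcard patterns, and marker lines.

```python
from typing import Iterable, List, Tuple

Entry = Tuple[List[str], str, str]  # (hostnames, key type, base64 key)


def parse_known_hosts(lines: Iterable[str]) -> List[Entry]:
    """Parse plain known_hosts lines into (hosts, key_type, key) tuples."""
    entries = []
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments (#), marker lines (@...) and hashed
        # hostnames (|1|...), none of which this sketch handles.
        if not line or line[0] in "#@|":
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        entries.append((fields[0].split(","), fields[1], fields[2]))
    return entries


def check_host_key(entries: List[Entry], host: str, key_type: str, key: str) -> str:
    """Classify the key a server presented as 'match', 'changed' or 'unknown'."""
    host_seen = False
    for hosts, known_type, known_key in entries:
        if host not in hosts or known_type != key_type:
            continue
        host_seen = True
        if known_key == key:
            return "match"
    # A stored key of the same type that differs is the "key has changed" case.
    return "changed" if host_seen else "unknown"
```

In this sketch, a "changed" result corresponds to the warning shown above, while "unknown" corresponds to a host the client has never seen before.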

One fundamental question is “how does this file get populated?” Since this trust relationship is so important, how do we add new servers to the list?

When you first connect to a new SSH server, your SSH client will tell you it doesn’t know who this server is, and ask whether you want to trust it. If you say yes, it adds the server’s key to the known_hosts file, and subsequent connections are checked against this stored identity:

The authenticity of host ’test_machine (’ can’t be established.
RSA key fingerprint is 81:c0:f0:b1:2f:e2:9d:69:93:7f:2b:47:44:8d:29:8a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ‘test_machine,’ (RSA) to the list of known hosts.

See the problem here? How do you know the machine is correct on that first connection? What if an attacker is already in between you and the machine, and capturing everything? You’re about to say you trust the incoming public key for all time. A MitM attack here can permanently compromise that machine and your communication with it.

We need to have an “out-of-band” mechanism for delivering the public keys. While the SSH client will automatically populate known_hosts with new hosts, this is a convenience mechanism that undermines security, because it trusts whatever key happens to arrive on the first connection. Your user should already have this validation mechanism in place the first time they connect, which means the known_hosts file must be pre-populated.
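Whichever channel delivers the keys, it can help to compare fingerprints rather than full keys. The sketch below assumes you hold a server’s public key line in the usual `key-type base64-blob comment` form, obtained from a source you trust. It computes the colon-separated MD5 fingerprint in the style shown in the prompt above, plus the SHA256 style that newer OpenSSH releases print. The function name `fingerprints` is hypothetical.

```python
import base64
import hashlib
from typing import Tuple


def fingerprints(public_key_line: str) -> Tuple[str, str]:
    """Return (md5_fingerprint, sha256_fingerprint) for a public key line."""
    # The second field is the base64-encoded key blob; fingerprints are
    # hashes of the decoded bytes, not of the base64 text.
    blob = base64.b64decode(public_key_line.split()[1])

    md5_hex = hashlib.md5(blob).hexdigest()
    md5_fp = ":".join(md5_hex[i:i + 2] for i in range(0, len(md5_hex), 2))

    sha256_b64 = base64.b64encode(hashlib.sha256(blob).digest()).decode("ascii")
    sha256_fp = "SHA256:" + sha256_b64.rstrip("=")
    return md5_fp, sha256_fp
```

If the value computed from the trusted copy of the key differs from the fingerprint the client displays, the key being offered is not the one you were given.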

You could do this manually: for every new server (and new user), securely copy the known_hosts file to the users’ home directories using something like `scp` or by logging in with a privileged account. Obviously this doesn’t scale very well, but it’s a good quick solution.
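As a rough illustration of the manual approach, the sketch below merges a trusted list of host keys into a user’s known_hosts file without duplicating entries that are already there. It assumes the trusted file has already reached the machine over a secure channel; the function name `merge_known_hosts` and both path parameters are hypothetical.

```python
from pathlib import Path


def merge_known_hosts(trusted_path: str, user_path: str) -> int:
    """Append trusted entries missing from user_path; return how many were added."""
    user_file = Path(user_path)
    current = user_file.read_text() if user_file.exists() else ""
    existing = {line.strip() for line in current.splitlines()}

    new_entries = []
    for line in Path(trusted_path).read_text().splitlines():
        entry = line.strip()
        # Ignore blanks and comments, and skip anything already present.
        if entry and not entry.startswith("#") and entry not in existing:
            new_entries.append(entry)
            existing.add(entry)

    if new_entries:
        user_file.parent.mkdir(parents=True, exist_ok=True)
        with user_file.open("a") as handle:
            # Make sure the appended block starts on its own line.
            if current and not current.endswith("\n"):
                handle.write("\n")
            handle.write("\n".join(new_entries) + "\n")
    return len(new_entries)
```

Running it a second time with the same trusted file adds nothing, so it is safe to repeat whenever the list is refreshed.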

Configuration managers are another (better) solution. Using something like Puppet or Chef, you can make sure that all your users are constantly provided with the latest version of all of their servers’ public keys. Right now this is probably the best solution, although it does carry the risk that anyone with access to the configuration manager scripts can tamper with the public keys.

Finally, you can put the onus on the users. Public keys are safe to share so long as the recipients have a reason to trust them. One option is that you, as operational manager, send out a list of host public keys (or a known_hosts file) signed with your own private key. Assuming your colleagues have and trust your public key, they can validate the integrity of the message and use it for their own known_hosts file.

Trust and identity are fundamental to security. Establishing a connection to a remote server is a high-risk activity, and it should be done correctly. Blindly accepting the public key the first time you try to sign in is folly, and could expose you to a man-in-the-middle attack. Be sure to use one of the mechanisms above to propagate a server’s public key “out of band” to securely establish identity in your network.

Greg Keller

JumpCloud CTO, Greg Keller is a career product visionary and executive management leader. With over two decades of product management, product marketing, and operations experience ranging from startups to global organizations, Greg excels in successful go-to-market execution.
