December 2009 – Robin's Interesting Thoughts

I was just explaining phishing scams to my roommate, and why many online banking websites now show you a personal picture when you log in. It reminded me that the main usability problem at the heart of phishing scams is the URL naming scheme: it’s unnecessarily complicated to figure out which site you are actually on.
What do I mean? The very heart of a phishing scheme is a URL at the top of the page such as:

http://www.bankofamerica.com.online.b04k.li/login.html

And the only way to know that it’s a phishing site is to conscientiously look at the last part of the first part of the URL, which is the part that has all the period separators and comes before the first slash, not counting the two slashes at the very beginning. Sheesh! Although web nerds have gotten used to this, it does not even remotely resemble an intuitive user experience. People see the “bankofamerica.com” portion out of the corner of their eye and assume all is well.
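To make that rule concrete, here is a minimal Python sketch of "the last part of the first part". The function name is made up for illustration, and it assumes the part that matters is always the last two period-separated pieces of the host. That assumption is wrong for endings like “.co.uk”, so treat it as a rough illustration rather than a real phishing check.

```python
from urllib.parse import urlsplit

def naive_registered_domain(url: str) -> str:
    """Return the last two dot-separated labels of the URL's host.

    Assumption: the owner-identifying part is always two labels long.
    Real software would need a list of multi-part endings such as "co.uk".
    """
    host = urlsplit(url).hostname or ""   # the part between "//" and the first "/"
    labels = host.split(".")
    return ".".join(labels[-2:])

print(naive_registered_domain("http://www.bankofamerica.com/login.html"))
# bankofamerica.com
print(naive_registered_domain("http://www.bankofamerica.com.online.b04k.li/login.html"))
# b04k.li
```

The second result is the giveaway: the site belongs to whoever owns “b04k.li”, no matter what appears to the left of it.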

If URLs simply worked from left to right, the real Bank of America would be: http://com.bankofamerica.www/login.html and the phishing scam would be: http://li.b04k.online.com.bankofamerica.www/login.html
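For the curious, the left-to-right form is just the host with its period-separated pieces in reverse order. A small Python sketch (the function name is hypothetical, and it ignores port numbers and usernames in the URL):

```python
from urllib.parse import urlsplit, urlunsplit

def left_to_right(url: str) -> str:
    """Rewrite a URL so the host's labels read from most to least significant."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    reversed_host = ".".join(reversed(host.split(".")))
    # Reassemble the URL with the reversed host; path, query and fragment are unchanged.
    return urlunsplit((parts.scheme, reversed_host, parts.path, parts.query, parts.fragment))

print(left_to_right("http://www.bankofamerica.com/login.html"))
# http://com.bankofamerica.www/login.html
print(left_to_right("http://www.bankofamerica.com.online.b04k.li/login.html"))
# http://li.b04k.online.com.bankofamerica.www/login.html
```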

Then at least we could tell everyone to just look at the leftmost thing (after the unchanging http://) and make sure it is familiar.
Of course, this is not really an option anymore because there’s way too much infrastructure in place using the existing naming scheme. But why don’t web browsers at least highlight the important part of the URL for you? It could look something like this:
And then the scam would at least have a chance of catching your eye:
And I could tell my grandma, “just look at the bold red portion before you enter your password.”
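Here is a rough Python sketch of what such highlighting might do, using double square brackets in place of bold red. The function name and the bracket markers are my own invention, it assumes the important part is always the last two pieces of the host (not true for endings like “.co.uk”), and it drops anything after the path to keep things short.

```python
from urllib.parse import urlsplit

def highlight_domain(url: str, start: str = "[[", end: str = "]]") -> str:
    """Wrap the last two labels of the host in markers (a stand-in for bold red)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    subdomains = ".".join(labels[:-2])    # everything in front of the important part
    domain = ".".join(labels[-2:])        # naive guess at the important part
    prefix = f"{parts.scheme}://" + (subdomains + "." if subdomains else "")
    return f"{prefix}{start}{domain}{end}{parts.path}"

print(highlight_domain("http://www.bankofamerica.com/login.html"))
# http://www.[[bankofamerica.com]]/login.html
print(highlight_domain("http://www.bankofamerica.com.online.b04k.li/login.html"))
# http://www.bankofamerica.com.online.[[b04k.li]]/login.html
```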

Does anyone know why the major browsers don’t already do this?

Update: Dave just pointed out that Microsoft has indeed publicly announced a similar domain highlighting feature for Internet Explorer 8:

Domain Highlighting lets you more easily interpret web addresses (URLs) to help you avoid deceptive and phishing sites that attempt to trick you with misleading addresses. It does this by highlighting the domain name in the address bar in black, with the remainder of the URL string in gray, making for easier identification of the sites [sic] true identity.

Update 2: Google Chrome does something similar: it colors the “https” green if the site comes with a valid security certificate. It also makes the domain name darker than the stuff after the “/”, but it doesn’t do anything to distinguish the pieces of the domain that actually identify the owner (here, “b04k.li”) from the pieces in front of them. So it is still open to phishing attacks like “www.bankofamerica.com.online.b04k.li”. Hopefully, phishing sites wouldn’t be able to get a green “https”, but the lack of a green prefix seems a lot less noticeable than the clear presence of a suspect domain.

My uncle recently asked me a variant of the question “where do domain names come from?”, and I learned a few new things after doing some Wikipedia research. Here is my attempt to explain it using language everyone can understand.

Part of what makes the Internet work at all is that it is designed to be distributed: there is as little hierarchical control as possible. The big idea is to let anyone connect to anyone without going through some commander at the top. If everyone had to go through the top, it would become a huge bottleneck.

A “web host” usually means any company that hosts web pages. This just means that they own computers that are connected to the Internet. Of course, your everyday desktop computer is also connected to the Internet, but for a variety of technical and financial reasons it usually makes more sense to go through a “web hosting” company if you want a web site that is going to be available 24/7 to anyone in the world. But the point is, anyone can connect a computer to the Internet, and thus anyone can be a web host. There are no qualifications. And that is part of why the Internet works at all.

However, the story is different for getting domain names. For domain names, some hierarchy is unavoidable, because you need some central way to determine who controls which names and which websites they point to. You want to be sure that “amazon.com” always goes to Amazon and never to the people behind “buy-stolen-belts-for-cheap.com”. In other words, you need to direct people to the right Internet-attached computer. (There is also some hierarchical control needed for various other technical pieces of the Internet.)

According to the Wikipedia articles, the US Department of Commerce is theoretically in charge of overseeing those aspects of the Internet that need some hierarchical control. However, it outsources the entire job to a non-profit corporation called ICANN, the Internet Corporation for Assigned Names and Numbers, which for historical reasons is based in the Los Angeles area, with roots at the University of Southern California. ICANN has the authority to (1) make certain policy decisions, and (2) outsource the management of sets of domain names, like those ending in “.com”, “.org”, or “.net”, to various other companies. For example, a company called VeriSign is in charge of handling all “.com” domain names, because it won that contract from ICANN. (Part of the contract specifies certain rules, such as limits on the fees it can charge.) But VeriSign, in turn, only handles the actual repository of domain names, and outsources the job of actually dealing with customers to still other companies! Those other companies have to be “accredited” according to certain standards set by ICANN.

For example, I have control of the domain name “robinstewart.com”. I used a company called DreamHost to process that registration and collect payment of $9.95 per year. Part of that money (about $3) goes to DreamHost, for dealing with me, the customer. Part of it (about $6.50) goes to VeriSign, for keeping track of all the “.com” domain names and making sure there are no conflicts. And a very small part of it (20 cents) goes to ICANN, to continue to make policies and track down anyone abusing the system.
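As a quick sanity check on that split, here is the arithmetic in Python, using the rounded amounts above expressed in cents. Since two of the three shares are only approximate, the pieces are not expected to add up exactly.

```python
# All amounts in cents; the DreamHost and VeriSign shares are the rough figures quoted above.
total_paid = 995
shares = {
    "DreamHost (deals with the customer)": 300,
    "VeriSign (keeps the .com repository)": 650,
    "ICANN (policy and enforcement)": 20,
}

accounted_for = sum(shares.values())
unexplained = total_paid - accounted_for

print(f"accounted for: ${accounted_for / 100:.2f}")   # accounted for: $9.70
print(f"left over:     ${unexplained / 100:.2f}")     # left over:     $0.25
```

The 25 cents left over is just rounding in the “about” figures, not a fourth party taking a cut.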

So there is a large hierarchy of organizations that all basically operate under the authority of ICANN, which in turn has some sort of mandate from the US government. There are some international governance boards and treaties, but for various political reasons (essentially, the system is working, so why change it?) the whole thing remains US-based.

And there you have it.


Simple defense against phishing

Where do domain names come from?