Episode Show Notes

[FULL TRANSCRIPT]

JACK: [MUSIC] There’s a big list of all known security vulnerabilities for computers. You want to know what the oldest known computer vulnerability is? The oldest I could find is weak default passwords. This has been a known vulnerability since 1969. Specifically, computers sometimes have the username admin with the password also admin. Then the computer doesn’t ask you to change it when you buy it, so it can stay that way for a long time, years. Many computers after that also use admin/admin as the default username and password. Over the years many hackers have been able to get into many systems that they didn’t own using this basic username and password. Now it’s been forty years since we became aware of this security weakness. Surely by now this weakness has been resolved, right? There aren’t any computers in the world that have this username and password anymore, right? Right? I sure hope so.

JACK (INTRO): [INTRO MUSIC] This is Darknet Diaries, true stories from the dark side of the internet. I’m Jack Rhysider. [INTRO MUSIC ENDS]

JACK: In 2012 a security researcher began scanning the internet to see what computers are still running Telnet. Telnet is a way to log into a computer remotely but it doesn’t have encryption, so when you log into a computer using Telnet you send your username and password in clear text for anyone to see on the internet. The alternative is to use SSH which does the same job but it’s encrypted. SSH has been around since the 90s so there’s really no excuse to run Telnet anymore, but while this security researcher was scanning the internet trying to see how many systems were running Telnet, they also wanted to see how many systems are using those default passwords. They used the following four username/password combinations: admin/admin, admin with no password, root/root, and root with no password. They took these four username/password combinations and started scanning the internet to see if any systems would let them log in using Telnet. They were finding unsecured systems pretty quickly. [BLIPPING] But it took them over sixteen hours just to scan 100,000 IPs.

The internet had almost four billion IPs so scanning the whole thing poses a big challenge. If they were to scan ten IPs a second it would take them ten years to complete the scan. The researcher thought if they had two scanners it would go twice as fast, and one hundred scanners would go one hundred times faster. Since the researcher was finding all these systems on the internet that they could log into as admin, then why not put those systems to work to help scan the internet? The researcher created a program that would scan and find unprotected systems and then upload that same program to the systems it found, and then put that system to work scanning for more systems. They were creating a botnet. A botnet is a program running on many computers that are all working together to do the same task, but the botnet creator doesn’t have permission to use any of those computers. Actually just logging into one computer as admin that they didn’t own was illegal. Of course, it was very illegal to do it to thousands of computers.
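The scan-time reasoning above can be checked with a few lines of arithmetic. This is only a sketch: the function name `scan_years` is made up for illustration, the address count is rounded to four billion, and it assumes every scanner keeps a steady rate with no overlap between scanners.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def scan_years(total_ips, ips_per_second, scanners=1):
    """Years needed to probe every IP once at a steady rate.

    Assumes the work splits evenly across scanners (an idealization).
    """
    seconds = total_ips / (ips_per_second * scanners)
    return seconds / SECONDS_PER_YEAR


TOTAL_IPS = 4_000_000_000  # assumption: "almost four billion", rounded
RATE = 10                  # IPs per second per scanner, as in the episode

single = scan_years(TOTAL_IPS, RATE)
hundred = scan_years(TOTAL_IPS, RATE, scanners=100)

print(f"1 scanner:    {single:.1f} years")
print(f"100 scanners: {hundred * 365:.0f} days")
```

With these rounded inputs a single scanner needs a little over twelve years, which fits the "ten years" ballpark given in the episode, and one hundred scanners cut that by a factor of one hundred.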

The researcher knew this was illegal and had to stay anonymous and not get caught. They let this program run and propagate all over the internet all night long. The next day the botnet had spread to 30,000 computers and wasn’t even close to finishing the full scan. After some tweaks and more testing and more scans, the botnet finished the scan of the internet looking for all devices running Telnet that had those default passwords. [MUSIC] The botnet discovered 1.2 million of these kind of devices. Many of these vulnerable devices shouldn’t even be on the internet. They were TVs and [00:05:00] industrial control systems, cameras, water sprinklers, none of which should be accessible from the internet. Out of those 1.2 million vulnerable devices, the botnet got installed in 420,000 hosts. Not all the systems could run the program and they didn’t want to install it on any industrial control systems.

Controlling 420,000 machines all at once was a complicated task. The researcher had to set up an elaborate system which included middle nodes and end nodes, and each system had to be controlled individually to perform a different task. Some systems would get rebooted and their IPs would change. It was a constantly changing environment. What would you do if you had control of 420,000 computers? With that many computers you could do a massive denial-of-service attack against your enemies or try to infect the world with a terrible virus. But this person had no evil intentions as far as we can tell. They were just a security researcher that was willing to break a few laws to try to understand the internet further. They now had a new mission which was to get a detailed scan of the entire internet. The first mission was just to see how many devices were running Telnet with default passwords.

This new mission was to use those vulnerable devices to do a full scan of the internet; not just checking for Telnet but pinging every IP and checking the top 100 ports. In 2012 there really wasn’t that much data from people scanning the entire internet, partially because it just takes so long. If you were to scan ten IPs a second it would take you over ten years to complete it. There were over 3.6 billion IPs allocated at the time, so scanning the whole internet required a lot of storage for the results. It also required a lot of time to complete the scan. They wanted to use this botnet to try to quickly scan the internet and see what’s out there. The scan they decided to do did numerous checks to see if the IP is alive, they would ping it and map it, and test to see if the top 100 ports were open on it. Even though the internet had almost four billion IPs on it, the scan would really need to make over 60 billion probes to test all these different things. But with the help of 420,000 systems they calculated they could scan the whole internet in an hour.
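To get a feel for the one-hour figure, here is a rough back-of-the-envelope sketch. It uses the 3.6 billion allocated IPs and 420,000 devices mentioned in the episode, and it makes a simplifying assumption that is not from the source: the work is split evenly and each IP is counted once, even though the real scan sent several probes per address. The function names are hypothetical.

```python
ALLOCATED_IPS = 3_600_000_000  # allocated IPs at the time, per the episode
BOT_COUNT = 420_000            # devices in the botnet, per the episode
SCAN_SECONDS = 60 * 60         # the one-hour target


def ips_per_device(total_ips, devices):
    """Addresses each device must cover if the work is split evenly."""
    return total_ips / devices


def aggregate_rate(total_ips, seconds):
    """Addresses per second the whole botnet must cover to finish in time."""
    return total_ips / seconds


share = ips_per_device(ALLOCATED_IPS, BOT_COUNT)
print(f"Each device covers about {share:,.0f} IPs")
print(f"That is about {share / SCAN_SECONDS:.2f} IPs per second per device")
print(f"Combined: {aggregate_rate(ALLOCATED_IPS, SCAN_SECONDS):,.0f} IPs per second")
```

Under these assumptions each device only has to handle a couple of addresses per second, while the collection side has to absorb results for roughly a million addresses per second.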

But this creates a new problem; storing that many scan results creates a major logistics issue. We’re talking about having the ability to receive over one million events per second of data coming back from the scan. The researcher built a web application using Python and PHP and used Hadoop as a database. At this point the botnet was now fully built and ready to conduct a full scan of the internet. [PIANO] The researcher looked at this creation and decided to call it the Carna Botnet. Botnets are sometimes named after Roman or Greek gods and Carna was the Roman goddess known to protect the vital organs of the physical body. [ELECTRONIC] While the researcher was setting up the botnet, they noticed something strange. They were finding someone else was also building a botnet and using the exact same vulnerabilities. They were finding this other botnet on the same computers that the Carna Botnet was installed on. It was known as the Aidra Botnet. But the Aidra Botnet had malicious intent.

It was being used to take down computers and did bad things. The researcher was able to detect that Aidra had infected over 30,000 of the same computers as the Carna Botnet. Being in this unique position, the researcher decided to block the Aidra Botnet from accessing devices. They were able to remove Aidra from the system and block that IP so Aidra wouldn’t come back. So Aidra started losing numerous nodes because of this. It fascinates me to think about these two botnets out there in the world, battling each other. After the Carna Botnet was built and more tests were done, it was time to conduct the full scan. The researcher gave the command for all 420,000 systems to scan the entire internet and it worked. All public IPs in the world were scanned and the data was collected on the results but to the researcher, that wasn’t enough. After building this massive botnet and an incredible infrastructure to support it, a single scan just wasn’t satisfying enough.

They decided to scan a second time, and a third, and a fourth. In fact, they continued to scan the entire internet over and over, repeating it again and again, week after week, month after month. Because hour by hour and day by day, the internet changes. So conducting numerous scans of the entire internet would be the only way to understand exactly what’s out there. After six weeks of continually scanning the internet and collecting all the data, the researcher shut down the botnet. All the programs that were on the infected hosts quietly deleted themselves and all systems were returned to just how they were before the botnet was installed. That’s the end of the story for the Carna Botnet. [00:10:00] Now begins the story of the internet census. [MUSIC] With all these billions of probes and data points collected from the Carna Botnet, it was now time for the researcher to pore through all this data and try to make sense of it. The researcher called this project the Internet Census of 2012.

Because there was so much data it was not easy to figure out what to do with it. The researcher analyzed and calculated and reviewed the data in numerous ways. Now, I think what this researcher did next was absolutely brilliant. Yes, the work they did up to this point was brilliant as well, but if they just published this data in a big spreadsheet and 40 page report, it probably would have gone unnoticed. All this data that’s in the database is interesting but it’s boring to read. It’s like reading a really dry technical book that’s just too long. Regions of the world were assigned a range of IP addresses. Africa gets one block, US gets another, and so forth. But even more specific states and cities are also given IP address ranges. The researcher started adding geographic locations to all the data they collected. GeoIP lookups were done on every IP address to determine where that computer was in the world. Eventually the data started to tell a story. The data was showing which IPs were online and where they were in the world. The researcher compiled all of this location data and placed it over a map of the world. This had amazing results.
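The idea of mapping an IP address to a region by the block it falls in can be sketched as follows. This is a toy illustration, not the researcher's method or a real GeoIP database: the table uses reserved documentation ranges, the region labels are invented, and `lookup_region` is a hypothetical helper.

```python
import ipaddress

# Hypothetical lookup table. These are reserved documentation ranges,
# and the region labels are made up for illustration only.
RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "Region A"),
    (ipaddress.ip_network("198.51.100.0/24"), "Region B"),
    (ipaddress.ip_network("203.0.113.0/24"), "Region C"),
]


def lookup_region(ip):
    """Return the label of the first range containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for network, region in RANGES:
        if addr in network:
            return region
    return None


for ip in ["192.0.2.15", "198.51.100.7", "10.0.0.1"]:
    print(ip, "->", lookup_region(ip))
```

A real lookup would use a much larger table, with finer-grained entries for states and cities, and would likely use a sorted structure rather than a linear scan.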

[MUSIC] The security researcher compiled all the data and published it anonymously for the world to see. This included a lot of details on how the Carna Botnet was created as well as how all the data was collected, and of course the map of all the computers in the world. You know what, you’ve gotta see this map for yourself. If you can, right now, stop what you’re doing, go to darknetdiaries.com, find the Carna episode, and let’s take a look at this map together. I’ll pause for a minute for you to load it. [HUMMING] Okay, on the map you’ll see lots of dots. There’s a dot on the map for every computer and location that was in the database. There are billions of dots. It’s hard for me to describe it. It’s truly a case where the data is beautiful and brilliant and magical, but that probably doesn’t describe anything so I took a trip to my local hacker space and asked some friends to describe it.

TED: It’s pretty.

GREGORY: It’s true, it is pretty.

BARRY: That’s insane.

KURT: Pretty cool, pretty cool.

CURTIS: Wow. There’s a lot of dark areas. No real surprise. This is impressive.

GREGORY: Pretty colors.

JEN: It looks remarkably, densely internetty.

ALLEN: The amount of technology that is on this planet, just at a glimpse is insane.

ZACK: I didn’t expect Brazil to have that much. Brazil is a lot denser than I expected it to be.

MICHAEL: Yeah, absolutely.

CARLOS: Seems kind of surprising that Europe seems to have a greater concentration than the United States. Yeah, I would have expected the United States to be a lot more red than that.

JACK: The map they’re looking at has billions of dots all over the globe. Regions that have a high concentration of computers online show up red and very bright, and regions that have a low number of computers show up blue. Areas that have no computers are completely dark.

ZACK: The brightness of America doesn’t surprise me at all.

BARRY: Australia is like – the coast lit up and I know that Australia is really barren in the middle. It’s mostly desert but it’s still nice to see how tightly packed it is towards the water.

KURT: New Zealand is amazing. It’s like, the whole thing.

CARLOS: Look at the islands in the Caribbean. They look almost like they’re forming a continuous line of lands all the way up to Florida.

JEN: I’m looking at bright spots in the middle of the water and thinking about what that means.

JACK: But this map is even more amazing than just dots on the world. The researcher had so much data from scanning the internet over and over and over that they were able to create an animated map showing the daytime/nighttime cycle. [00:15:00] Along with this animation we can see what hour of the day different regions of the world come online and go offline.

JEN: I’m watching the sun shadow pass over the lights and matching that up.

TED: What I’m seeing is – when you look at it you’re seeing it how it lights up. It lights up in almost a cascade. It goes from bottom to top in a wave, basically, in how it lights up. Really interesting.

GREGORY: Italy goes full load earlier than the rest of Europe. It’s almost like Italy is a couple hours ahead of the rest of Europe ‘cause it also – it surges earlier and it drops off earlier.

ALLEN: Middle of Australia, there’s a huge area where there’s no computers turning on and off.

ZACK: Notice how also bright that India is.

KURT: I love that when you go way up north, like North Pole, Greenland and stuff, you still see activity like way, way out.

SHAUNTY: Can we zoom in? Can we zoom in?

JACK: Everybody who I showed this map to marveled at the magnificence of the data they were looking at. Some people noticed Los Angeles comes online about the same time as New York. Some people noticed it’s completely dark in North Korea, and other people saw that Canada, Russia, and the northern parts were all dark except Scandinavia. Even at extreme northern latitudes, it’s lit up. Because the security researcher created such a beautiful map to display the data collected, this map went viral and spread across the world.

Everyone got to marvel at how big the internet was. This is the first map of the internet and it amazed us all. Now, a half decade later, I still see this map pop up in my social feeds from time to time with someone new discovering it and swooning over its beauty. Most people see this map and have no idea what it took to create it. But because of how beautiful the map is, to them it doesn’t matter how it was created. It’s still marvelous and worth spending a minute to look at. The creator of this botnet remained anonymous and nobody ever openly took credit for this. This is because even though the Carna Botnet had good intentions, it was still illegal since it uploaded and ran programs on machines that weren’t owned by the researcher. The botnet creator had to stay hidden and anonymous after publishing the data. This story probably would have ended right here if it wasn’t for one person.

PARTH: My name is Parth Shukla. I’m currently a security engineer at Google here in Switzerland. Previously before Google I used to work for AusCERT, the Australian Computer Emergency Response Team based in Brisbane in Australia. When I first read about this I had just started working for AusCERT. It was my first month. This is my first IT security job ever. I was still studying at the time, I still hadn’t graduated. I was the newbie in – I read this thing, I went well, this is interesting. I don’t know what we’re supposed to do. I’m looking for guidance from the senior people because I’m not sure what the standard response procedure is within the company.

I think someone suggested just e-mail the guy. I went what? They’re like yeah, just e-mail him. Maybe he’ll give you something. I think it was actually in jest. They made a joke, it was like yeah, as if you’re gonna hear back. I’m like okay, I guess I can do that. So I found the e-mail that was, I think on the GitHub page already and I sent him an encrypted e-mail saying hey, can you give us since we’re AusCERT, we’re supposed to look after the Australian interest. Can you give us the compromised IPs that you used for the botnet scan for Australia only? I got back a response that said actually, you’re the first person to contact me and here is everything. I was pretty shocked. That’s how that started.

JACK: When he read about the Carna Botnet, there was one thing that stood out to him. Those 1.2 million systems that were on the internet running Telnet and using default passwords. He thought there should be no reason for this many unsecured devices to be out there. He wanted to understand that problem further. When he asked the botnet creator for just the vulnerable devices in Australia, the researcher gave Parth the full list of all 1.2 million vulnerable devices.

PARTH: The data itself was about 882 MB. It was a big text file that was formatted with tabs and it just basically contained MAC addresses, manufacturers, RAM, hostname, CPU info, IPs, country codes of all the devices. Approximately 1.2 - 1.3 million.

JACK: Parth got busy trying to make sense of the data. First he did everything he legally could do to verify the data. He organized the data in different ways, figuring out which countries had the most vulnerable systems and which manufacturers were responsible for [00:20:00] creating the most vulnerable devices.

PARTH: To me, these kind of – this indicated, for example, the manufacturer indicated this was a systemic issue. They were building and shipping devices that were vulnerable from the factory and they were shipping them en masse and that’s why they were – this one or two manufacturers – I think there were like, three really big manufacturers that were over-represented in the data. For the IPs it was a little harder because certain countries were over-represented but they also had more devices allocated to them globally anyway. Percentage-wise, they were not that bad. Actually, one of the things that I did in my research paper is, I tried to figure out how easy it is – it would have been to find a vulnerable device.

If you started scanning a random IP range in a particular country of interest, how long would it take you? I published a table as part of my paper of the number of seconds it would take you to find a vulnerable device given the statistics we have. We know from all the internet registries, all the allocated devices – sorry, all the allocated IP ranges for each of the countries. We know from Carna Botnet all the number of devices in each country. We can do some simple maths to figure out percentages and likelihoods. I think, for example, the device – I think for Australia, for example, what I was interested in, if you started scanning randomly within just the Australian IP address range it would take you about an hour on average to find one vulnerable device. Whereas in China it would take you an average of about twenty seconds.
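The "simple maths" described here can be sketched like this. It is a hedged illustration and not necessarily how the table in the paper was computed: it assumes addresses are sampled uniformly at random from a country's allocated range, so the number of tries before the first vulnerable device follows a geometric distribution with mean 1/p, where p is the vulnerable fraction. The function name and all input numbers are hypothetical.

```python
def expected_seconds_to_hit(allocated_ips, vulnerable_devices, probes_per_second):
    """Average time to find one vulnerable device by random probing.

    With p = vulnerable_devices / allocated_ips, the expected number of
    probes is 1 / p, so the expected time is
    allocated_ips / (vulnerable_devices * probes_per_second).
    """
    if vulnerable_devices <= 0 or probes_per_second <= 0:
        raise ValueError("need at least one vulnerable device and a positive rate")
    return allocated_ips / (vulnerable_devices * probes_per_second)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
allocated = 1_000_000   # addresses allocated to an imaginary country
vulnerable = 100        # vulnerable devices found in that range
rate = 10               # probes per second

seconds = expected_seconds_to_hit(allocated, vulnerable, rate)
print(f"Expected time to first hit: {seconds:.0f} seconds")
```

The same formula shows why the result differs so much between countries: the expected time scales with the ratio of allocated addresses to vulnerable devices, not with the raw device count.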

JACK: When Parth started realizing how vulnerable the internet was, he decided to do something about it.

PARTH: The end result was that I talked to over twenty CERTs from different countries. I notified all the CERTs that had more than ten thousand devices in their countries. I actually e-mailed them a copy of the relevant data. For example, for the US I would have sent them a copy of all the US compromised devices. For China, I sent them all the Chinese compromised devices. The intention there is, this is kind of the job the CERTs are trying to coordinate with other national agencies who would know better how to handle the situation in their local country. The Chinese would know okay, which manufacturers or which carriers they should go talk to and they have their own national contacts. To me, the responsibility here lies more-so with the manufacturer because they sold you a device with certain promises. From the manufacturer sides, I actually contacted the IEEE.

JACK: The IEEE is an organization that creates standards for electronic components. They are the authority figure for which manufacturer can use which MAC address. The MAC address is a local designator assigned to every network interface on every device in the world. Parth had a list of 1.2 million MAC addresses as part of the data he got from the botnet creator.

PARTH: I went to the IEEE and said these are the manufacturers we have derived. These are the top ten or twenty. I can’t remember the exact number. Can you give us their contact details? Because I want to contact them. You should have the authoritative information on this; I don’t want to just go on ‘cause a lot of corporations can share the same name or have similar names. I want the authoritative info from you and if I remember correctly, they denied the request. They said they can’t share it for privacy reasons. But they said if I had something to pass on, they would pass it on. I remember writing quite a terse letter with my contact details and saying please reach out to me; I have something to share with you.

I know that ten or fifteen manufacturers I reached out to via the IEEE, only one replied and that was one of the Turkish manufacturers that was quite well represented for Turkey. They contacted me asking for more details, then I contacted them back. I think we did some phone calls to make sure authenticity was good. Then I sent them an anonymized version of the data, so I removed basically the IP addresses but I sent them just the devices that had them as the manufacturer to help them figure out which of their particular devices are actually vulnerable. I’m hoping the Turkish one ended somewhere. I haven’t heard from them since. I gave the data and fingers crossed they did something good with it.

JACK: With the data Parth collected from the Carna Botnet, he made it his mission to try to resolve this problem of so many vulnerable systems being online. He thought by contacting CERTs in other countries he could help clean up the vulnerable devices out there. By contacting the device manufacturers he could stop them from creating vulnerable devices, but it didn’t seem like very many CERTs or manufacturers were interested in helping solve the problem. Parth was having a hard time getting [00:25:00] organizations to pay attention to this problem. But there were some people who were paying attention to this data. [MUSIC] Hackers with malicious intent were seeing how the Carna Botnet was created and started making their own botnets using the exact same methods.

PARTH: There’s been multiple – and I’m sure there’s hundreds of them running right now. The tool called Lightaidra exploited the exact same vulnerability. It was released in parallel, I think, just a little earlier before the Carna Botnet data was released. I think it was independently discovered. It’s not a complicated issue to be discovered, right. That led other people in the community to go hey, this is so simple. I just click and like I said, on average about, depending on where you point, on average anywhere from ten seconds to 180 seconds you’ll definitely find an IP address that’s vulnerable. That’s a really good hit rate.

[MUSIC] What I really like about the Carna Botnet data, in hindsight, is it came before this became a big thing, before many of these botnets started forming, exploiting the same vulnerability over and over again. I feel like we have the largest, most accurate data before other botnets took over and started shutting down the port, Telnet port, which would stop further investigation. This, to me, seemed like a really nice imprint, the 1.3 million devices vulnerable worldwide, quite accurate at the time because he did it multiple times over a course of months. I’m referring to the anonymous researcher as he. I don’t actually know if that’s true. For the whole year – I worked for AusCERT for about a year and a half and out of that, for a whole year, I was working on just this. I was very lucky, very lucky that AusCERT allowed me to spend that kind of time on something that wasn’t actually related to Australia.

JACK: There were a few people in the security community that condemned the data that came out of the Carna Botnet, saying that because the data was illegally obtained we should not use it for any legitimate research.

PARTH: I agree it’s an illegal botnet. There’s no way I can disagree with that statement. The use of the data, I guess obviously my position’s been clear since you can see how I’ve used it. I haven’t really had any big, ethical qualms about it but in my opinion, the reason I think the researcher even bothered to give us this data is that he also wanted this problem fixed. That’s very clear by the multiple e-mails. I sent them quite a lot of questions and he continued answering them. When I did my first presentation at the AusCERT conference, I sent him the slides and he replied he was happy with that outcome.

Since then he’s stopped communicating. My conclusion from those events is he got what he wanted. He wanted publicity. He wanted a proper analysis done from someone that has good reputation as AusCERT does in Australia. Once he got all of those, he was happy. I see that the reason he went into this effort to provide this data, to answer these questions, is that he didn’t create this botnet because he wanted to own the world and destroy things and make a profit. He realized it’s a problem and he wanted it fixed.

JACK: Parth, do you think you were the only person to contact the botnet creator?

PARTH: Yes. The creator said so. The last communication I had with the creator – we exchanged two or three e-mails and the last one I just checked. I think it was a few months after our initial contact. I said hey, has anyone contacted you yet? The response was no, you’re still the only one. Then I haven’t had any contact with the researcher since.

JACK: Still to this day, the creator remains anonymous but do you have any thoughts on who it might be?

PARTH: At the time in 2012, storage of nine terabytes of data was not cheap. He had to store it and he had to compress it using ZPAQ which is incredibly CPU expensive. Actually, related to this, for the internet census part, the public data, I did my undergraduate thesis on it. I had to decompress that data to be able to access the raw data so I could index the raw data and then do some analysis on it. That took me – the university had a high performance computing cluster of I think, four hundred machines, I think six hundred CPUs and even on that it took a day to decompress all this data. From 500 GBs, and once you decompressed it, it became nine terabytes.

That decompression took me a day on a high performance computing cluster with three hundred CPUs. From my mind I just went, whoever this is obviously has a lot of money because the claim was he did it on an Amazon cluster. This would cost ridiculous – back in 2012 with Amazon prices try storing [00:30:00] nine terabytes there for more than six months, then continuous collection, and then CPU crunch to compress it so you can upload the 500 gigabytes. I just see that there’s a lot of layers here, where the only conclusion I could come up with was this was probably an already established researcher who was doing some private home research and didn’t want the data associated with his public identity. That’s the best I could come up with.

JACK: Why did you stop working on this data?

PARTH: This is a battle that seemed like we should be able to win but I made no progress. My focus has now personally shifted towards focusing on problems that I can fix at hand. Whenever industry-wide impacts like these are necessary, you actually have to propose a solution at a specification level. For example, MAC addresses are controlled by IEEE. If IEEE made a mandate on something then these manufacturers would be forced to follow it. Currently the IEEE is not in the business of making mandates on security. That would be an uphill battle but that’s a battle that now you actually know a specific person, a specific entity that you can get involved with. There’s [inaudible] that are open to participation to a certain level of people, then now you have some hope of how you can address this systemic issue through one – by catching yourself with the core problem. For this particular example, I don’t have a solution but I’m just giving an example. IEEE is an example.

One of the reasons I dropped working on this a lot, I left AusCERT, that’s for one thing, but I also haven’t spent any significant time chasing this up because I think this is a dead-end. Trying to get manufacturers to pay attention through the public-face is a nightmare because what matters to manufacturers most is maintaining good PR. If that’s how you attack them, then they’re going to be defensive. The way to get the problem fixed, if that’s what you really care about, is to go through the back channels. Find the engineers, find the people who know what it is, who actually make these designs. A lot of times what I find, there is a tendency in security to go – look at these developers. They don’t know what they’re doing. They’re idiots, right? But what I find a lot of times, is if you talk to these engineers who actually made these products, who decided to leave Telnet open with default credentials, you realize that given the circumstances they were in, it was not a stupid decision.

They had deadline crunch, they had all these other things that are happening. They had to leave the default creds open in case the device wasn’t set up correctly, so Help Desk can dial in remotely and make sure that everything works properly for the layman consumer. There’s all these requirements that are imposed on these engineers. They try their best to convey them and sometimes they don’t have security backgrounds nor are they trained to be aware of these security problems. When you actually sit down and have a chat with them or convince them, a) I think it’s a lot easier to convince them because they can see problems because they have the same mindset and then once they see them, they will start looking for a proper solution themselves. Then you can exclude yourself from this problem now because you’ve just made the correct people aware what the problem is. That’s the lessons I’ve taken away. This was a brutal entrance to the security industry for me. It’s my first thing I did in this job. There’s nothing more I could do, right; I tried my best. It’s time to move onto something that’s not as soul-crushing.

JACK (OUTRO): [OUTRO MUSIC] You’ve been listening to Darknet Diaries. For show notes and links, check out darknetdiaries.com. There you’ll find the full animated map of the internet census as well as all the research that Parth has published including his full presentation. A very special thanks to everyone who had comments about the map. This show is made entirely by me, Jack Rhysider.

[OUTRO MUSIC ENDS]

[END OF RECORDING]

Transcription performed by LeahTranscribes