Databases Open Source Security

Insecure Hadoop Servers Expose Over 5 Petabytes of Data (bleepingcomputer.com) 51

An anonymous reader quotes the security news editor at Bleeping Computer: Improperly configured HDFS-based servers, mostly Hadoop installs, are exposing over five petabytes of information, according to John Matherly, founder of Shodan, a search engine for discovering Internet-connected devices. The expert says he discovered 4,487 instances of HDFS-based servers available via public IP addresses and without authentication, which in total exposed over 5,120 TB of data.

According to Matherly, 47,820 MongoDB servers exposed only 25 TB of data. To put things in perspective, HDFS servers leak 200 times more data than MongoDB servers, which are ten times more prevalent... The countries that expose the most HDFS instances are by far the US and China, but this should come as no surprise, as these two countries host over 50% of all data centers in the world.
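As a quick sanity check on the ratios quoted above (all figures taken directly from the summary):

```python
# Figures as reported in the summary (Shodan scan, June 2017)
hdfs_tb, hdfs_servers = 5120, 4487
mongo_tb, mongo_servers = 25, 47820

print(hdfs_tb / mongo_tb)            # 204.8 -> roughly "200 times more data"
print(mongo_servers / hdfs_servers)  # ~10.7 -> "ten times more prevalent"
```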


Comments Filter:
  • by Anonymous Coward on Sunday June 04, 2017 @01:26PM (#54546793)

    And yet companies keep hiring younger people and getting rid of experienced pros that understand security

    also why is the article making it sound like a Hadoop issue when it's clearly the dumbass millennials that configured these so poorly?

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      At my company, some idiot developer used a public-facing URL to host PDFs of our customers' health insurance claims so that he didn't have to write an on-demand report generator to display the same information in an HTTPS session. Even though the file names were pseudo-random, Yahoo quickly crawled it and made the information searchable. It went on for years until a customer called in and asked why his information was turning up in a Yahoo search.

      That inexpensive off-shore developer cost the company millions.

      • by Nkwe ( 604125 )

        At my company, some idiot developer used a public-facing URL to host PDFs of our customers' health insurance claims so that he didn't have to write an on-demand report generator to display the same information in an HTTPS session. Even though the file names were pseudo-random, Yahoo quickly crawled it and made the information searchable.

        So not only was private information made publicly available, the PDF files were in a directory that was marked browseable by the web server? That's extra nice.

    • why is the article making it sound like a Hadoop issue when it's clearly the dumbass millennials that configured these so poorly?

      Baby Boomers - Destroying the ecosystem.
      Gen X - Destroying the global economic system.
      Millennials - Not giving any fucks because they are the worst paid generation.

      I'm glad you're focused on the right things here. ;)

    • Re: (Score:1, Troll)

      by Narcocide ( 102829 )

      Because nobody competent would be using Hadoop in the first place.

  • by Anonymous Coward

    Big Data is, by definition, huge volumes of mundane data, usually in unstructured or semi-structured formats, with a very low density of interesting or useful information. But when aggregated over hundreds of terabytes, some useful patterns can sometimes be gleaned. Now, are the hackers going to ship terabytes of data out of the datacenter and hope nobody notices what amounts to a DoS attack?

    Yes, there should be protection, but it's like heavy equipment and materials being left unattended at a construction site.

  • will have to make a run to Best Buy for a few more thumb drives.

  • by Anonymous Coward

    My experience is a couple of years old, but when I did a deep dive into Hadoop a serious flaw quickly came to light:

    Hadoop was NEVER designed for security.

    Want to own a Hadoop server? Create a hadoop account on your own box and connect to it. Bang, you are "root" on a Hadoop install.

    Hadoop installs should only be implemented in a secured environment and use restricted VPN connections into it. Anyone who allows the "Internet" to connect to a Hadoop install is an idiot.

    This security "flaw" in design is
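    The weakness described above comes from Hadoop's default "simple" authentication mode, which trusts whatever username the client asserts. The usual mitigation is to enable Kerberos; a minimal core-site.xml sketch (these property names come from Hadoop's secure-mode configuration, and a working setup additionally needs a KDC and per-service keytabs):

```xml
<!-- core-site.xml: switch from "simple" (trust the client-asserted
     username) to Kerberos, and enforce service-level authorization -->
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
```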

    • I don't see that as being a flaw at all. Most software should be written like that.

      The problem is with the people who use the software assuming that random special-purpose projects like Hadoop have planned for security or are competent to do so. Just assume it's all insecure unless there's good reason to think otherwise, and access it via VPN or SSH.

  • "According to Matherly, 47,820 MongoDB servers exposed only 25 TB of data. To put things in perspective, HDFS servers leak 200 times more data compared to MongoDB servers, which are ten times more prevalent..."

    Was this statement actually intended as a bragging point for MongoDB? I've looked at this statement several times, and I can't come up with any other spin. Seriously - if somebody threw this line out there trying to sell me on his preferred piece of software, I'd immediately leave and vow to never use

  • When you replace knowledgeable workers with a "let's Google it" mob, this is what you get. People use Hadoop just because it's big data, but in reality they don't know how to implement it correctly.
  • Fix it now or it costs you 2 orders of magnitude more when the (code) boat sinks.

  • How many of those servers are actually supposed to be accessible, and how many of them are accessible only because they exist on a network with insufficient protection and oversight?

  • by PPH ( 736903 ) on Sunday June 04, 2017 @03:50PM (#54547533)

    ... because, to date, nobody has figured out how to get data out of a Hadoop database.
