IT's Next Hot Job: Hadoop Guru
gManZboy writes "JPMorgan Chase and other companies at this year's Hadoop World conference came begging for job applicants: They say they can't find enough IT pros with certain skills, including Hadoop MapReduce. That spells high pay. As for Hadoop's staying power as a career path (a la SQL 30 years ago), IBM, Microsoft and Oracle have all embraced Hadoop this year. Maybe the best news of all: 'Intelligent technologists will pick up Hadoop very quickly.'"
I'll start now! (Score:4, Insightful)
After all, every other framework of the month has lasted for 30 years, Hadoop will have at least as much staying power as Ruby on Rails!
Re: (Score:2)
After all, every other framework of the month has lasted for 30 years, Hadoop will have at least as much staying power as Ruby on Rails!
I wonder if you can learn how to create and maintain security in less than a week.
Re: (Score:2)
it's simple - rent a cement mixer and wire snips - start with pouring it over everything until it is fully encased - then go around with the wire snips and cut any communication cables coming out of the system..
wait - did you want to use it? then how could it possibly be secure if you allow users to use it?
Re: (Score:2)
Is that the latest excuse as to why the "security solutions" out there today are complete crap?
Re: (Score:2)
Is that the latest excuse as to why the "security solutions" out there today are complete crap?
Just put it in a VM. If it's in a VM it's secure, right?
Re: (Score:1)
Accessibility is an aspect of security
No, it isn't.
Accessibility is a concession security types make.
If a thing is DDoSd, the security is actually improved.
The PHBs will whine and cry, though.
Re: (Score:1)
Which is why the goal should be user-unfriendly. If they can't figure out how to use it, you won't have to worry about security issues.
Re:I'll start now! (Score:5, Funny)
Hadoop is great, fast, and easy to use!*
*Statements are based on the word count example and TeraSort. Performance may vary greatly. May need to spend significant amounts of time to tune cluster for your particular data and applications to see any real performance. Applications may need to be specially designed to fit within the tuning constraints of the cluster. This statement does not apply if you are using binary data of significant size (BDOSS). Multiple data sets and apps may not perform equally well within the cluster. Data pre-processing, formatting, sequencing, and other such steps are not included in this statement. If you have any problems, hope to $DEITY Google returns a hit. See your browser search bar for further details.
Bad learning resources (Score:1)
If you want a strong userbase, projects with good, easy to use learning resources do better. When you hit the hadoop main page, they tell you what it is, but not what you need to know in order to use it. They don't tell you what languages it supports. They give no examples of usage. Essentially, they don't do you any favours.
Re:Bad learning resources (Score:5, Informative)
If you want a strong userbase, projects with good, easy to use learning resources do better. When you hit the hadoop main page, they tell you what it is, but not what you need to know in order to use it. They don't tell you what languages it supports. They give no examples of usage. Essentially, they don't do you any favours.
I spent some time trying to implement some nice free tools from IBM and Apache. I found I needed to download X and do a build of it, but halfway through it wanted Y to complete the build. OK... So I go find Y and try doing a build on it, but need something else from Apache, which doesn't like the version of Apache I'm running. So I get the other Apache thing and find I can't get it to start up. I go research it and find conflicting and incomplete information all over the web. I throw in the towel.
The one thing needed is a single source for information and clear instructions for a basic, default build of a platform. Once that is reliable, then document ways to add foo and bar or even plugh if it suits you.
Re: (Score:1)
Agree completely. Describes my experience with Linux - which I've used and liked, for the most part, since '95 - very closely. I now use a Mac as my primary machine. I get the close-to-the-metal experience - when I want it - with the ease of double-click installation.
Re:Bad learning resources (Score:5, Informative)
drink the maven kool aid, and your worries will be behind you.
To use hadoop :
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>0.20.205.0</version>
</dependency>
in your pom.mxl
Then write 2 classes like these:
class MyMap extends MapReduceBase implements Mapper<K1, V1, K2, V2 >...
class MyReduce extends MapReduceBase implements Reducer<K2, V2, K3, V3>...
Feed instances of those to a JobConf and feed that instance to a JobClient.
The rest should be obvious to a seasoned programmer, just by looking at the nomenclature of the class hierarchy.
The great Ward Cunningham is right: put two days into studying something and you are already halfway to expert. [infoq.com]
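For readers without a cluster handy, here is a rough plain-Java sketch of the type flow described above: a mapper turns (K1, V1) records into (K2, V2) pairs, the pairs are grouped by K2, and a reducer turns each group into (K3, V3) results. This is not the Hadoop API. The names `SketchMapper`, `SketchReducer`, `run` and `wordCount` are made up for illustration, and the whole thing runs in one JVM with no JobConf or JobClient involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class LocalMapReduceSketch {

    /** Hypothetical stand-in for a mapper: one (K1, V1) record in, any number of (K2, V2) pairs out. */
    public interface SketchMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, BiConsumer<K2, V2> emit);
    }

    /** Hypothetical stand-in for a reducer: all values for one K2 in, any number of (K3, V3) pairs out. */
    public interface SketchReducer<K2, V2, K3, V3> {
        void reduce(K2 key, List<V2> values, BiConsumer<K3, V3> emit);
    }

    /** Runs map, then groups by intermediate key (the "shuffle"), then reduce, all in memory. */
    public static <K1, V1, K2 extends Comparable<K2>, V2, K3, V3> Map<K3, V3> run(
            Map<K1, V1> input,
            SketchMapper<K1, V1, K2, V2> mapper,
            SketchReducer<K2, V2, K3, V3> reducer) {
        // Map phase: every emitted pair is filed under its intermediate key, sorted by key.
        Map<K2, List<V2>> groups = new TreeMap<>();
        for (Map.Entry<K1, V1> record : input.entrySet()) {
            mapper.map(record.getKey(), record.getValue(),
                    (k, v) -> groups.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Reduce phase: one call per intermediate key, in key order.
        Map<K3, V3> output = new LinkedHashMap<>();
        for (Map.Entry<K2, List<V2>> group : groups.entrySet()) {
            reducer.reduce(group.getKey(), group.getValue(), output::put);
        }
        return output;
    }

    /** Word count expressed with the two interfaces above. */
    public static Map<String, Integer> wordCount(List<String> lines) {
        Map<Integer, String> input = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i++) {
            input.put(i, lines.get(i)); // key = line number, value = line text
        }
        SketchMapper<Integer, String, String, Integer> mapper = (lineNo, line, emit) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word, 1);
                }
            }
        };
        SketchReducer<String, Integer, String, Integer> reducer = (word, ones, emit) -> {
            int sum = 0;
            for (int one : ones) {
                sum += one;
            }
            emit.accept(word, sum);
        };
        return run(input, mapper, reducer);
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(wordCount(Arrays.asList("to be or not to be")));
    }
}
```

On a real cluster the grouping step is done by the framework across machines, which is why the mapper and reducer only ever see one record or one key at a time.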
Re: (Score:2)
Is there a similar maven recipe for deploying a hadoop cluster?
no, not now, but if you need it, you could easily fork http://mojo.codehaus.org/wagon-maven-plugin [codehaus.org] so that it could be used like this: mvn install deploy:deploy deploy:execute-on-remote.
Hadoop is easy -- having the analytical skills to express a problem as mapreduce in the first place is the hard part.
Agreed, math is more useful. My half-expertise enables me to assert that the right way to express a problem for hadoop is to formulate it in terms of associative ((a+b)+c==a+(b+c)) and commutative (a+b==b+a) operators, plus distributive ones (a*(b+c)==a*b+a*c). But I only have two days of self-education on the subject, if someone more experienced would like to enlighten us
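A small plain-Java sketch of why the algebra matters, under the assumption that partial results are computed per split and arrive at the final step in no particular order. Addition is associative and commutative, so summing per split and then summing the partial sums gives the same answer as one pass. A mean is not combinable that way, so the usual workaround is to carry (sum, count) pairs instead. The class and method names here are invented for the example and nothing in it touches Hadoop itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CombinerAlgebraSketch {

    /** Sums one split locally, the way a per-node pre-aggregation step would. */
    public static long partialSum(List<Integer> split) {
        long sum = 0;
        for (int v : split) {
            sum += v;
        }
        return sum;
    }

    /** Sums each split, then adds the partial sums in reversed order to mimic arbitrary arrival order. */
    public static long sumViaSplits(List<Integer> data, int splitSize) {
        List<Long> partials = new ArrayList<>();
        for (int i = 0; i < data.size(); i += splitSize) {
            partials.add(partialSum(data.subList(i, Math.min(i + splitSize, data.size()))));
        }
        Collections.reverse(partials);
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    /** Averages the per-split averages. Wrong in general, because splits can differ in size. */
    public static double naiveMeanOfMeans(List<Integer> data, int splitSize) {
        double sumOfMeans = 0;
        int splits = 0;
        for (int i = 0; i < data.size(); i += splitSize) {
            List<Integer> split = data.subList(i, Math.min(i + splitSize, data.size()));
            sumOfMeans += (double) partialSum(split) / split.size();
            splits++;
        }
        return sumOfMeans / splits;
    }

    /** Carries (sum, count) per split and divides once at the end, which is safe to combine. */
    public static double meanViaSumAndCount(List<Integer> data, int splitSize) {
        long sum = 0;
        long count = 0;
        for (int i = 0; i < data.size(); i += splitSize) {
            List<Integer> split = data.subList(i, Math.min(i + splitSize, data.size()));
            sum += partialSum(split);
            count += split.size();
        }
        return (double) sum / count;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(sumViaSplits(data, 3));        // 28, same as a single pass
        System.out.println(meanViaSumAndCount(data, 3));  // 4.0, the true mean
        System.out.println(naiveMeanOfMeans(data, 3));    // about 4.67, not the true mean
    }
}
```

The takeaway is that the hard part is usually reshaping the problem so the intermediate values can be merged in any grouping and any order.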
Re: (Score:2)
You forgot your oozie workdflows.mxl!!!1
And then what about your PIG UDF's... and wait, I actually want to do it in scala, mkay?
Re: (Score:3)
I spent some time trying to implement some nice free tools from IBM and Apache. I found I needed to download X and do a build of it, but halfway through it wanted Y to complete the build. OK... So I go find Y and try doing a build on it, but need something else from Apache, which doesn't like the version of Apache I'm running. So I get the other Apache thing and find I can't get it to start up. I go research it and find conflicting and incomplete information all over the web. I throw in the towel.
The one thing needed is a single source for information and clear instructions for a basic, default build of a platform. Once that is reliable, then document ways to add foo and bar or even plugh if it suits you.
Sounds like IBM all right. They make some decent products sometimes. I'm fairly certain that other times they go out of their way to make things a pain in the ass to use. Maybe it's supposed to be a joke on the rest of the world?
Re: (Score:2)
??? I just got a note from my manager on "big data", and decided to take a look at Hadoop. I downloaded the latest stable release, set JAVA_HOME in the config file and ran the example program. Total time to having a working instance: about a half hour, which included five minutes or so to download the tarball. Did you not see this page [apache.org]?
Re: (Score:1)
Java is one of the most inefficient languages ever? I take it you've never programmed in ruby, python, perl, etc. IIRC, Java benchmarks have shown it outpacing everything except for C/C++, FORTRAN and OCaml.
Re:Right. (Score:5, Interesting)
Java is one of the most inefficient languages ever? I take it you've never programmed in ruby, python, perl, etc. IIRC, Java benchmarks have shown it outpacing everything except for C/C++, FORTRAN and OCaml.
On first execution (and compile) it's slow. On first creation of an instance it is slow. After that Java makes up for itself rather nicely. If well implemented it's a great way to go, though I wouldn't choose it for my 3D rendering or reconciling a fiscal year's worth of journal entries, it's not that kind of language.
Re: (Score:2)
On first execution (and compile) it's slow. On first creation of an instance it is slow.
But it doesn't have to be slow ever! Microsoft .NET doesn't have most of those problems, despite being otherwise mostly identical. That's because Microsoft applied this fantastic new technology that apparently Sun has never heard of called a "cache".
This is why Java fell flat on its face in the desktop world, because Sun couldn't wrap their heads around the fact that every launch will be a "first instance" because having dozens of simultaneously running instances of a single process is very rare on desktop
Re: (Score:2)
though I wouldn't choose it for ... reconciling a fiscal year's worth of journal entries
You might be surprised how often java is used to do just that.
Re: (Score:1)
I take it you don't know what the phrase "one of X" means, because you have perfectly described what I mean by "Java is ONE OF THE most inefficient languages ever" by listing more of them.
Are people really this dumb these days?
Yes, yes they are.
All I hear is java this, python that. People care more about being able to throw shit on a wall and have it run than they do about performance, reliability, or functionality.
Getting the Experience (Score:5, Informative)
The trick is going to be getting the appropriate experience without having learned it on the job already.
Yes, it can be done. However, this technology is geared towards environments with lots of nodes in big clusters (which can run Linux). That's not the same as simply learning a language.
I got a job utilizing a "Big Data" database technology by being at the right place at the right time, when this technology was being rolled out. It's also hard to find people with that specialized experience.
So I would suggest to companies, hire people and train them. Just get quality people if you can't find someone with the specific skill set.
Re: (Score:1)
Just last year I was contacted by a headhunter demanding 5 years of Exchange 2010 experience...
Did you offer him a +1 Funny or -1 Troll? :-)
Re: (Score:3)
Oh, how I wish this was how it's supposed to work... many employers really REALLY -MEAN- having ten years of experience in the product that JUST came out. I remember getting through the 'gauntlet' only to have the interviewer get really pissed that I didn't have the experience. Told me I 'wasted his time and mine, and that he would make sure I would NOT be considered for any position in that company EVER, just for being such a liar'.
Needless to say, I told him how awesome he was - and never worried about
Re: (Score:2)
Maybe you just don't get that there are seriously ignorant people out there in positions that somehow determine your eligibility for employment. Especially your average HR manager.
I interviewed for a gig working for an MMO developer, the HR guy spent 45 minutes talking about how "This workplace culture eludes h
Re: (Score:2, Interesting)
That's OK, it won't stop moron head hunters from stipulating in the coming weeks that they only want Hadoop programmers with at least 5 - 10 years experience. I remember seeing that for Java programmers... in 2000.
Blame HR departments. They need to spend some time with the internal department which needs the guru. I remember having a good laugh in 1999 when some ads were run, looking for people with at least 10 years Java experience. The sick thing is the HR department or Headhunter will use that as a screening device and only end up with liars applying -- like the contractor we had for 2 weeks, who claimed to be an expert in a staggering number of tools and languages, despite a rather young age -- yeah, he had to
Re: (Score:2)
> I remember having a good laugh in 1999 ... how to write a date verification function
Y2K, good times. good times...
Re: (Score:2)
20 years experience including the following:
a.
b.
c.
d.
e.
f.
etc. etc.
Re: (Score:1)
I've seen some of those time-travel ads myself. My colleagues said of this practice, "just lie to the HR people. It's what we did to get here."
Re: (Score:1)
I recently inquired about a side gig involving that rare database skill. Apparently, they weren't interested in a part-time person with this skill who was willing to do remote work.
No, the person had to be on-site... for a 3 month contract. I just told them "good luck finding the right candidate". But as you guys said, they probably wind up with liars.
Re: (Score:2)
If you get an offer, why do you still need to assess your skills? Or did you mean an interview?
IT needs apprenticeship with classes and real work (Score:2)
and NOT just CS classes.
Take a tech school class load and add apprenticeships to it.
Re: (Score:1)
People should just do a huge piracy network with it.
Re: (Score:1)
I understand. I'm speaking to getting the experience at all.
Re: (Score:1)
You can get a master/slave combo VMWare VM at http://www.cloudera.com/ [www.cloudera.com]. They also have packages for Ubuntu; I made an at-home cluster of VMs with one master and a slave that I can replicate.
What about stability and uptime of their web?!? (Score:1)
general IT market fairly hot (Score:4, Interesting)
Re: (Score:2)
At its core, 'programming' is you 'telling' computers what to do.
Since you are doing 'IT support', 'SQL', etc. you are already 'programming'.
The real problem is the Babel effect of multiple, heterogeneous computer 'languages'. So why limit yourself? Pick one (say Perl, or Powershell) and then depend upon CPAN or PowerShell libraries to do the heavy lifting for you.
Re: (Score:2)
You didn't just describe the "general IT market"; what you're describing is commonly called the Web 2.0 Bubble. The same people who funded the .com bubble learned nothing from that, and tech company startups have been repeating the same sort of overvalued silliness that leads to a bust again during the last few years. When we have ridiculous things like Groupon being "valued" at billions of dollars, of course there's a bunch of money hiring to build more companies in that space. All of that combined is still
Re: (Score:2)
I was warning about what I see as high business risk around the current web+mobile boom (relative to traditional IT jobs), not making a moral commentary about either type of work. And I already work on open-source software that has real effects on people's lives, you cowardly troll.
Re: (Score:2)
I think it's a bubble mindset. It was a similar atmosphere around 1999 or so. Not sure how it's different this time... Just flashier with different key-phrases, all hoping to cash in on the massive growth that "advertising dollars" will be bringing in as the economy of the world unravels.
"Gurus" need not apply (Score:3)
Re:"Gurus" need not apply (Score:5, Funny)
If I were a recruiter, I would automatically be wary of anyone who seriously refers to themselves as a "guru" of $language. Sure, you may be good at writing code and may know a particular library inside out, but anyone who calls themselves a guru probably has a very overinflated sense of their importance and actual skill level. These also tend to be the people who have the right buzzwords to get past HR filters and then proceed to bullshit their way through interviews.
"It says in your resume you were part of the initial development team and wrote one of the first reference books on $language."
"That is correct, I was also part of a team which worked to ensure cross-platform consistency and stability. I've also written tutorials in $language and developed several application examples which are included in the reference website."
"Anything else you'd like to add?"
"I also have chaired the past two Worldwide $language development conferences and am teaching an Introduction to $language at the local community college."
"That all sounds very good, but what development experience do you have developing $language in $businessEnvironment?"
"None, really. I think this will likely be the first instance of its kind using $language in $businessEnvironment."
"Sorry to hear that. We're looking for someone with more experience. Thank you for your time, there's the door."
Re: (Score:3)
I have a guy who can write the hell out of C# and C++ but the only way you get anything out of him is to give me the most detailed SOW you can possibly provide. You try to get him to talk to any stakeholder, process owner or direct management and he's as useless as tits on a bull.
Which is why you have a Systems Analyst as the go between, or at the very least, a Project Manager. Two shops I have worked in have cordoned off the developers from the users (including and particularly external customers.) Give the coder direction and let him/her go to it.
Re: (Score:2)
which makes perfect sense. Just because you can make a lathe doesn't mean you have the necessary experience to build, say, an airplane turbofan engine, and if the company is looking for somebody who has turbofan experience, why would they rather hire the guy who built a lathe?
Re: (Score:2)
That's an apples to orangutans comparison. A better one would be, the company needs to hire someone who has built hinges using a lathe, why would they hire someone whose only experience is in designing and building lathes and teaching others to do so?
The correct answer would be, they'd be fools not to hire that person.
Re: (Score:1)
No, it's your comparison that makes no sense. You are assuming that designing and building a language is a similar experience to designing and building some business specific application.
I will not argue on a point that a person who is experienced and smart enough to design and build a language is likely a person who can build a business app, however this is not a person who has experience building business apps, and while he has experience building a language, this is not the experience that is required.
Co
Re: (Score:1)
That's an apples to orangutans comparison. A better one would be, the company needs to hire someone who has built hinges using a lathe, why would they hire someone whose only experience is in designing and building lathes and teaching others to do so?
The correct answer would be, they'd be fools not to hire that person.
They'd be fools to hire that person.
Let's think about this for a minute. Do you see any downside to hiring somebody who is clearly overqualified for the job?
How soon before this person finds the work uninteresting, gets bored, and then starts looking for a job elsewhere? If this person is over-qualified, that implies that they can (easily?) get a job that is more intellectually stimulating and better paying elsewhere. It's in everyone's interest that this over-qualified individual doesn't get hired.
Be were of Gurus before crying wolf (Score:2)
As the song goes:
Oooh ooh Loupgarou gonna get ya, betta run to the river or ya gonna be dead.
Novices and Experts? (Score:2)
Quoting the article: "The company (JP Morgan) has been working with Hadoop for more than three years"
Then the article quotes the experts:
"The good news is that Hadoop experts aren't born, they're trained. "I'm sure companies that train their workforces on Hadoop will derive lots of benefits," said Jeremy Lizt, VP of engineering at Rapleaf, in a recent interview. A data provider that has been using Hadoop for nearly four years, Rapleaf was among the earliest adopters."
What a difference a few months makes...
from TFA... (Score:2)
Wow they must be super experts!!!
Re: (Score:2)
Also from wikipedia:
On February 19, 2008, Yahoo! Inc. launched what it claimed was the world's largest Hadoop production application. The Yahoo! Search Webmap is a Hadoop application that runs on a more than 10,000-core Linux cluster and produces data that is now used in every Yahoo! Web search query.
go figure that out...
Re: (Score:2)
This might explain it a little better:
link [apache.org]
Big data in a small world? (Score:2)
Re: (Score:2)
The thing I always wonder about Hadoop is how important can it get? It's only useful if you have too much data for an RDBMS, right? It seems like only JPMorgan and other giant companies could make use of it. Am I wrong?
There's no such thing as too much data for an RDBMS.
There is such a thing as poor database planning and a shitty schema, though.
Re: (Score:1)
I believe it can be used to feed data into "Big Data" databases like Netezza, Vertica, etc.
So what you're saying is... (Score:2)
...all us job-seekers who are already familiar with several other languages and/or frameworks should read the Wikipedia page for Hadoop, bullshit our way past the HR person, then learn Hadoop on the job.
Re: (Score:3)
Sounds like the way I got my first Linux-based job in '95, except I used newsgroups instead of Wikipedia.
Re: (Score:2)
I remember when I was in college, I was taking a class on social informatics (basically, sociology for computer nerds) and I still remember the professor saying, once you know how to be a developer, you can learn any language that's useful. So if you're ever called in for an interview, spend the weekend before boning up on the language. You won't ever need to be an expert in a specific language if you already know the core concepts of programming.
Jobs? (Score:1)
Is it a bad sign that I saw 'Job' and thought 'Not another Steve Jobs story...'? :P
At least the Jobs frenzy seems to be dying down lately.
The Future (Score:2)
SQL is a query language, not a database implementation technology. In the future Hadoop-style engines will probably be wrapped by SQL such that it will be an implementation detail or choice, similar to the MyISAM versus InnoDB choice in MySQL.
I'm not saying this will make it a non-career, only that the career will morph to be more like that of an Oracle tuning specialist (who make good money still).
Re: (Score:2)
There already exist tools/frameworks to work with Hadoop and HBase using SQL :)
Hadoop Is Easy: MapReduce + plumbing (Score:2)
"hot" trend? (Score:2)
Re: (Score:2)
There are positions out there.
Most of the folks that are hiring Hadoop and HBase folks are doing it on the sly.
It's how I got my current job :)
Re: (Score:2)
Many large corps have databases like Netezza (IBM) or Greenplum (EMC). To get better deals on their contracts, they'd like the leverage of having an "alternative"... Hadoop is often seen as that alternative (similar architecture, different mind-set) that can potentially be shoehorned into doing similar things that Netezza or Greenplum does, and not cost $bazillion dollars.
they're all tools (Score:2)
After 20 years in the industry, in various forms, I've come to this realization: C++, Java, Hadoop, Ruby on Rails, PHP... all these things are the airgun and socket wrench and grinder and welder and all the other tools in the garage. What matters is if you have experience working on BMWs or Kenworths or IndyCars or Harley-Davidsons. In other words, have you written accounting systems, industrial control systems, customer-facing websites, etc. I don't want to work for someone who's going to hire me because
Who finds it difficult? (Score:2)
Does anybody actually have a hard time learning Hadoop? In my experience it's pretty easy to pick up and go with.
XCPU Plan9 (Score:1)
No thanks, I will stay with my old friends v9fs, xget and xcpu =(
Can you not train your people in Hadoop (Score:1)
Looking for gurus seems like a needle-in-a-haystack proposition. Would it not be easier to take some of your current employees and train them on Hadoop? Assuming your employees are homo sapiens, they could be trained to deploy, develop applications with, and maintain Hadoop installations.
Re: (Score:1)
Looking for gurus seems like a needle-in-a-haystack proposition. Would it not be easier to take some of your current employees and train them on Hadoop? Assuming your employees are homo sapiens, they could be trained to deploy, develop applications with, and maintain Hadoop installations.
It is interesting. I'd been messing with Hadoop a bit before speaking to my employer about it. We were using Sensage at the time for data mining, which it was sorta able to handle (they have a SQL-like environment available). But performing joins has never worked properly (one of a few peeves I've had about the product).
About a year ago I went out to Hadoop training and built two small clusters of 10 data nodes each for work. Hive and some HBase running (and yes we can do joins in Hive). Pretty