Programming Science

'The Code Has Already Been Written' 253

theodp writes "John D. Cook points out there's a major divide between the way scientists and programmers view the software they write. Scientists see their software as a kind of exoskeleton, an extension of themselves. Programmers, on the other hand, see their software as something they will hand over to someone else, more like building a robot. To a scientist, the software soup's done when they get what they want out of it, while professional programmers give more thought to reproducibility, maintainability, and correctness. So what happens when the twain meet? 'The real tension,' says Cook, 'comes when a piece of research software is suddenly expected to be ready for production. The scientist will say 'the code has already been written' and can't imagine it would take much work, if any, to prepare the software for its new responsibilities. They don't understand how hard it is for an engineer to turn an exoskeleton into a self-sufficient robot.'"
This discussion has been archived. No new comments can be posted.

  • I expected more (Score:4, Insightful)

    by kikito ( 971480 ) on Sunday July 24, 2011 @02:58PM (#36864520) Homepage

    The abstract in Slashdot is pretty much the whole text in the linked post. The other 3 paragraphs repeat the same idea.

    • Re:I expected more (Score:5, Insightful)

      by rubycodez ( 864176 ) on Sunday July 24, 2011 @03:04PM (#36864554)
      The whole premise is stupid anyway. I've worked with plenty of scientists in national labs that turn out production grade, maintainable code; and programmers who didn't. The core issue is getting people who write code for reuse by others to follow guidelines, regardless of title or profession.
      • Re:I expected more (Score:5, Insightful)

        by NoNonAlphaCharsHere ( 2201864 ) on Sunday July 24, 2011 @03:14PM (#36864628)
        It isn't stupid at all. Lots and lots of scientific software is written by grad students who are worried about the results and don't care about the quality of the code itself. Their idea of what is "good code" has no relation to what a programmer who's worked in a production environment would call "good code". And invariably they decide to include libraries from other grad students at other institutions of equal or lesser value. And don't even get me started about documentation...
        • grad students != scientists

          Code by students who are actually learning to code professionally will have the same issues.
          • Re:I expected more (Score:5, Informative)

            by 16384 ( 21672 ) on Sunday July 24, 2011 @03:50PM (#36864840)
            The bulk of scientific research is done by grad students (or others like them with various kinds of scholarships). The professors whose names are at the end of a paper's author list guide and oversee what is done, but don't have time for the daily grind of research. Their main job is to teach and get funding.
          • Re:I expected more (Score:4, Informative)

            by calmofthestorm ( 1344385 ) on Sunday July 24, 2011 @05:06PM (#36865412)

            PhD students at tier 1 and 2 research universities are basically bottom-rung scientists-in-training (sometimes with UGs below them). For our first year or two we'll take a class or two a term, but the bulk of our time is spent doing research, writing and reading scientific papers, and presenting at conferences. For the last 3-4 years we typically take no classes and spend all our time doing research and teaching. We're professionals who make $20-$30k/yr depending on the location, plus full benefits and tuition waivers for any classes we do take. Expectations of workload are typically higher than entry level positions in industry (50-80 hours/wk, depending on the field and PI), and pay is obviously worse. The postdocs and professors do do some of the research themselves (especially when younger), but for the most part their time is spent directing the general direction of the research and applying for grants to fund it, doing the work (for free) to review and organize journals, and of course teaching. Most of us are aware we won't be going on in academia after the PhD, and I at least am okay with that.

            It's nothing like a master's or an undergraduate degree at all. We really aren't students in any meaningful sense of the word given the modern sense of college, aside from the fact that we'll get a degree in time. In Europe there are post-graduate degrees awarded after the PhD, so I guess you could call their postdocs "students" as well.

            It's completely different outside STEM, however, with PhD students typically earning little to nothing and sometimes having to pay tuition.

        • Re:I expected more (Score:5, Informative)

          by gmueckl ( 950314 ) on Sunday July 24, 2011 @03:43PM (#36864804)

          Exactly this! It is not about the education of the people writing the code. It's about the purpose for which it is written. I've done it all.

          As a scientist, most software that I write is geared to solving the problem at hand, nothing more. Sometimes this can be 10,000 lines of C++ code, at other times a short Python script or 10. Each time, the code serves as a sort of automation for something that needs to be done anyway (I could attempt to compute the simulation result myself, by hand on pen and paper, you know... if I don't die of old age first ;) ). Often, not a single thought goes into how to make this stuff reusable, robust or more generic. It works on the one machine it is ever going to run on and very likely nowhere else, because it does not matter. What matters is the program output, not the program itself.

          As a software developer I have to think differently. Software gets compiled, packaged and deployed elsewhere. It must run out of the box, never crash, give useful error messages and recover cleanly if something bad happens. And amidst all this effort, there's a tiny bit of code hidden somewhere doing the actual work. All that matters is that the program behaves correctly, no matter what the concrete output is. I might not even be expected to understand what the output actually means - it's not my primary concern.

          See the difference?

          • Frankly, no. Your software is used to determine conclusions. It should be as available and legible as any other portion of the output or your conclusions are suspect.

            Otherwise, you wind up with "Oops! Can't find it or don't know what that part does."
            • by Lennie ( 16154 )

              I do think he got the basic idea right:

              The end product of the scientist is the output of the program and the conclusions he/she derives from that.

              The end product of the professional programmer is the program and it will very probably need to be changed again at a later time.

          • I work for a group at NASA. One of our group's tasks is to take scientist-written code and wrap it for distribution to hundreds of remote sites around the world. We try our damnedest to run the code as-is, but fairly often have to modify it to remove stuff like:

            * Hard-coded input and output file and directory names
            * Small and arbitrary length limitations on file pathnames - I've run into buffers that were declared as 53 characters in length, probably because that was what they needed on their system
            * Larg
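The first two items in the list above can be illustrated with a minimal sketch. It assumes a hypothetical processing step that reads one file and writes one result; the option names `--input` and `--output-dir` and the `.result` suffix are invented for illustration and do not come from the poster's code.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Take file locations from the command line instead of hard-coding them."""
    parser = argparse.ArgumentParser(description="Hypothetical processing step")
    parser.add_argument("--input", type=Path, required=True, help="input data file")
    parser.add_argument("--output-dir", type=Path, default=Path("."),
                        help="directory that receives the result file")
    return parser


def output_path(input_file: Path, output_dir: Path) -> Path:
    """Derive the result name from the input name; no fixed-length path buffer involved."""
    return output_dir / (input_file.stem + ".result")


def main(argv=None) -> Path:
    """Parse arguments and return where the result would be written."""
    args = build_parser().parse_args(argv)
    return output_path(args.input, args.output_dir)
```

Calling `main(["--input", "/data/run42/scan.dat", "--output-dir", "/tmp/out"])` returns the path `/tmp/out/scan.result`. Because paths are ordinary `Path` objects, their length is limited only by the operating system, not by a buffer size chosen on the author's machine.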

        • A lot of scientific software is run less than 10 times, often only once. It generates the result, end of story (well, go away and understand what you got). There really is no point in extensively recoding for reuse, checking all the consts are const, etc. Documentation of the form 'does X using method Y (numerical recipes page P)' is often enough - i.e. a couple of lines of comments at the top of the file. It doesn't have to look nice, it just has to be correct. And don't even get me started on optimization
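As a rough illustration of the "does X using method Y" style of documentation described above, here is a sketch of a one-off script whose only documentation is a few lines at the top. The function name and the choice of the trapezoidal rule are assumptions made for the example; the reference line is left as a placeholder.

```python
"""Integrate sampled data y(x) using the composite trapezoidal rule.

Method: sum 0.5 * (y[i] + y[i+1]) * (x[i+1] - x[i]) over all intervals.
Reference: (textbook and page number would go here).
"""


def trapezoid(x, y):
    # One-off code: assumes x is sorted and len(x) == len(y), with no checking.
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))
```

The header says what is computed and how, which is enough for the author to check the result later. It says nothing about reuse, and that is the point being made.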
      • The problem with that is recognizing what code is going to be reused by others and what isn't.

        I'm an aerospace engineer who writes a lot of code (and does so on the taxpayer's dime), and it is a struggle to find the right balance between getting something functional for the immediate task, and recognizing what will be useful for others later. Since it's much more difficult to write the second variety (particularly if it needs to be generalized for as-yet unknown tasks), it's just as important to perform some

      • Re:I expected more (Score:5, Insightful)

        by icebike ( 68054 ) on Sunday July 24, 2011 @03:26PM (#36864704)

        The whole premise is stupid anyway. I've worked with plenty of scientists in national labs that turn out production grade, maintainable code; and programmers who didn't. The core issue is getting people who write code for reuse by others to follow guidelines, regardless of title or profession.

        Because you can point to a few (very few) exceptions does not make the story untrue in the vast majority of cases.

        Scientist code is usually a giant JUST-SO story, sufficient to derive the results they need for the task at hand.
        They either don't have, or avoid putting in data that will crash the program so limit checking is not necessary.
        Crashes are fine if they do nothing more than leave a trail of breadcrumbs sufficient to find the offending line of code.
        Output need not be in final form, and any number of repetitive hand manipulations of either the input or the output are fine as long as the researcher does not need to spend more time writing any more elaborate code.
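The contrast between "crashes are fine" and limit checking can be sketched as follows. Both functions are hypothetical: the first is the research-style version that trusts its input, and the second wraps the same arithmetic with the checks a production version would be expected to carry.

```python
import math


def mean_log_ratio(signal, background):
    """Research-style: trusts its input; bad data ends in a traceback."""
    return sum(math.log(s / b) for s, b in zip(signal, background)) / len(signal)


def mean_log_ratio_checked(signal, background):
    """Production-style: same arithmetic, but bad input is rejected with a clear message."""
    if len(signal) == 0 or len(signal) != len(background):
        raise ValueError("signal and background must be non-empty and of equal length")
    for i, (s, b) in enumerate(zip(signal, background)):
        if s <= 0 or b <= 0:
            raise ValueError(f"sample {i}: values must be positive, got {s} and {b}")
    return mean_log_ratio(signal, background)
```

With a zero in `background`, the unchecked version stops with a `ZeroDivisionError` traceback, which is the "trail of breadcrumbs" that is enough for its author. The checked version names the offending sample, which is what someone who did not write the code needs.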

        This is perfectly fine. The cabinet maker makes jigs. They are designed for their own shop and no one else has exactly the same saw and exactly the same gluing clamps. When the cabinet maker sells his shop, these jigs become useless. Nobody else knows how to use them.

        The scientist who takes the time to do a full fledged, fully documented, maintainable, fail-soft package for analysis of data that is unique to their project and their apparatus is probably not doing very much science, and probably not doing their intended job. That budgets force them into this situation is not unusual.

        It happens every day in industry, academics, and research. To hand-wave it away by saying you know someone who delivers the full package merely calls into question your own understanding of the meaning of a complete, fully documented, maintainable, transferable, and robust software package.

        • The other half of my career dealt with in-house code writers. I can assure you they have the same problem, by and large, and worse code
        • Scientist code is usually a giant JUST-SO story, sufficient to derive the results they need for the task at hand.
          They either don't have, or avoid putting in data that will crash the program so limit checking is not necessary.
          Crashes are fine if they do nothing more than leave a trail of breadcrumbs sufficient to find the offending line of code.

          Funny — this could as easily describe how physicists often write mathematics.

          In this paper [aps.org] (the paper itself is here [caltech.edu]), Feynman notes that

          The mathematics is not

          • by icebike ( 68054 )

            Well said.

            And the corollary is that you don't need to know everything about a tool to use the tool.

            Spending any significant amount of time learning even a close approximation of everything about any given tool wastes the work product of civilization, unless you are the tool maker.

        • "The scientist who takes the time to do a full fledged, fully documented, maintainable, fail-soft package for analysis of data that is unique to their project and their apparatus is probably not doing very much science, and probably not doing their intended job."

          that's very debatable.

          total number of citations can be a big deal in academia and some of the most cited papers in existence are tool papers.

          BLAST netted the author almost 35 *thousand* citations.

          and that's a real measure of how many other scientist

        • by chrb ( 1083577 )

          Scientist code is usually a giant JUST-SO story, sufficient to derive the results they need for the task at hand. They either don't have, or avoid putting in data that will crash the program so limit checking is not necessary.

          Welcome to the worlds of in-house, bespoke and embedded software engineering. This issue is not limited to scientists - in every company I have ever worked at, "getting it done" was more important to management than "code quality".

      • Yes, at my first job (at a world-class R&D organization) we built our own fluid dynamics software to model fluid flow, and we did a ton of work checking our code against real life.
      • by ceoyoyo ( 59147 )

        Research code should be written with good practices if possible but it should ALWAYS be rewritten when it becomes something more than research code.

        Research is trying things out to see if they work. The code will always be messy, confusing and convoluted to some extent. Taking that as a package and turning it into some sort of product is silly.

        I have a degree in CS and just recently helped commercialize something that came out of my PhD. The tech transfer company blew their budget hiring a development house

      • "I've worked with plenty of scientists in national labs that turn out production grade, maintainable code; and programmers who didn't."

        Do you know scientists who turn out production grade, maintainable code, even if they weren't specifically asked to? If you're writing a program to solve a specific part of the problem you're working on, and are only going to use it that one time (happens often in scientific fields and engineering), why would you bother to make it neat and maintainable, or even understandabl

  • by gomerbud ( 117904 ) on Sunday July 24, 2011 @03:07PM (#36864566) Homepage
    To a scientist, their software is simply a tool, a means to an end. Their results and discoveries are what they really care about. When it comes to reproducing scientific results for verification, it is actually advantageous that another group not use existing software. Another research group using the same faulty software, with the same hidden bugs, will likely come to the same incorrect result.

    Productization of software is a completely different exercise. You have to make your software work for a larger crowd on a plethora of devices. You actually have to consider how your software fits into the larger product lifecycle. The key difference here is that you have customers that you need to keep happy.
    • This tells us that academics' view of software is incompatible with the commercial world. So all that teaching CS in universities does is train CS graduates to think the same way - that the code is the product. This goes a long way towards explaining why there's so much poorly documented, badly explained and crappily designed stuff out there. Because the people who write it have never been educated in the importance of productising it.

      While that shortcoming can be overcome in a commercial organisation, wi

      • by godrik ( 1287354 )

        That's a very narrow view. An artist makes paintings but also knows how to paint his wall white. An artist knows how to draw, but can also sketch to express an idea.

        I am an academic. And most of my code is thrown away after use. (Well, it is actually stored, in case I need it later. But I usually don't.) Moreover, most of my code is just there to tell me if a given approach works or does not work. Once I know, I don't care about the code anymore.

        Still I know how to write a proper code with a proper documentatio

      • by cynyr ( 703126 )

        A lot of the stuff I see/help write, is for modeling a physical product, the code is simply a way to guess how it will react in the real world within some margin. The code is not the product, it simply helps sell the product. These products might change in such a way that the model needs to be rewritten maybe every 10 years. By then the original writer is gone, 3 versions of microsoft office have come and gone, as have 6-15 versions of the company wide selection software.

        Basically even if the old code was doc

      • "the people who write it have never been educated in the importance of productising it."

        Which is only important if your goal was productising it to start with.

        Which for the most part, it is not the goal of for-academic-goals produced code.

        You know, software can be written for different reasons.

    • by swillden ( 191260 ) <shawn-ds@willden.org> on Sunday July 24, 2011 @04:41PM (#36865236) Journal

      There's a nice concept devised by Ward Cunningham which captures this issue nicely: "Technical debt".

      Failing to put in the effort that makes code maintainable during its construction incurs a notional "debt" which the software carries with it. Future developers working on the code "pay interest" on this debt in the form of time wasted on understanding and modifying the crappy, undocumented code, or on fixing bugs that wouldn't have been present if the code were better. Sometimes, those future developers may decide to spend time refactoring, building tests or documenting, and those cleanups pay down the "principal" on the "debt". After their cleanup work is done, future work has smaller interest payments (less effort for the same results).

      Startups often deliberately decide to incur great amounts of technical debt on the theory that if the revenue starts flowing in they'll have the money to fund paying it down, but if they don't start getting some money the whole company will evaporate.

      For scientific research, it's pretty clear that it also makes sense to incur lots of technical debt in most cases, because there's little expectation that the code will be used at all once the research is complete. Even when that's not the case, I think few scientists really know how to create maintainable software, because it normally is the case. I don't see a lot of scientists spending time reading about code craftsmanship, or test-driven design, or patterns and anti-patterns, or... lots of things that at least a sizable minority of full-time software engineers care a lot about.

      I guess the bottom line, to me, is that this article is blindingly obvious, and exactly what I'd expect to see, based on rational analyses of the degree of technical debt it makes sense for different organizations to incur.

      • leverage debt (Score:5, Insightful)

        by epine ( 68316 ) on Sunday July 24, 2011 @06:50PM (#36866152)

        That's a great post, of the kind that saves me a lot of typing. You covered the first-order considerations brilliantly.

        What you missed was technical debt blindness, which has been around since forever. Books I read around the time of the Mythical Man Month talked a lot about maintenance syndrome: that the original development team would be regarded as brilliant for producing working functionality at tremendous speed (undocumented, with no error handling for edge cases), then the first maintenance team would all be fired as underachievers for adding hardly any new functionality in the first year or two.

        Turns out it's hard to erect a machine shop over top of adobe mud brick construction without adding some reinforcement to the structure, which usually takes a lot longer than the entire original edifice.

        You can instead take a wrecking ball to the first iteration, but this rarely works out as well as hoped. You end up with far more ambitious adobe mud construction built with a whole new generation of unproven tools. At some point you have to bite the bullet and ferment what you began with.

        People hide debt blindness behind widely divergent construals of simplicity, where "simple" usually turns out to be a euphemism for any decision that sidesteps paying down debt in the short term.

        For professional software engineers, there is one true simplicity to rule them all: generativity and compositionality. Can you build the next layer on top with any hope of having it work and able to support an ongoing stack? For us, it's a long term game of pass the baton. For everyone else (management, scientists) the endgame is to cash out, and take credit elsewhere (e.g. publication biography).

        Unfortunately, a citation is not a formal linkage that the compiler either accepts or rejects. By the standards of compositionality, citation is payment in dubious coin. Citation is not falsifiable. Scientists still count their citations even when they come from papers that are full of crap, peer review notwithstanding. For a professional software engineer, when you start instantiating objects from one library inside an abstract expression template library, you come face to face with compositionality in a way that few scientists can even imagine, having weened at the outrage of being improperly cited.

        Technical debt blindness on the part of management quickly turns a software engineering shop into a highly non-linear fiasco. We've all seen this.

        Somehow this game works out better (for the participants) when played by bankers with leverage debt. But now it's my turn to pass the baton, since that deserves a whole lot more typing and I've done my bit.

  • This assumes people are very clearly an engineer/programmer OR a scientist. But I would consider most software engineers to be computer scientists as well. It's a fairly nonsensical distinction. The analogy to Spider-Man and Doc Ock is fun, but ultimately metaphors don't prove anything.

    "Programmers need to understand that sometimes a program really only needs to run once, on one set of input, with expert supervision. Scientists need to understand that prototype code may need a complete rewrite before it can be
    • by plover ( 150551 ) *

      The point is that it's a one-way street. Software engineering is a specialization of engineering science, but most scientists aren't software engineers. A scientist can create the embodiment of an algorithm representing a solution to their problem, but doesn't think of it in terms of the qualities of reusability, modularity, interface, coupling, cohesion, exception handling, security, data integrity, etc. And they aren't supposed to: they're trained to understand biology, botany, physics, or whatever their

  • by digitrev ( 989335 ) <digitrev@hotmail.com> on Sunday July 24, 2011 @03:11PM (#36864600) Homepage
    I work with Monte Carlo code and statistical analysis software. I use CERN's ROOT package for the stats analysis, CERN's GEANT4 for the MC code, and *nix scripting when I need to handle multiple files. Every single piece of code I write is written for a purpose. That purpose is generally to generate data and then analyze it. The only other people who are going to see it? Maybe my supervisor, and, if I'm just in on a contract, maybe the guy who has to work on my code later. But to be blunt, that doesn't matter. All that matters is that I know what's going on.

    That being said, sometimes I write software for my own personal use. There, I tend to write more robust code, trying to follow various programming standards. Because I figure, if I write something for myself that turns out to be fairly useful, someone might want to use it, or adapt it. But professionally, all my code needs to do is get out that table or prepare that figure. Is it sloppy? Yes. Does it get the job done? Also yes. Fortunately, not only is my field esoteric, it's also government work, so it's practically a guarantee that my code will never have commercial release.
    • by zrbyte ( 1666979 )
      Based on my experience, the amount of work that I put into creating quality code is dependent on the task at hand. When I know that the script or software will only be used once or twice (prepare the graph, etc.) it's not worth it to put a great amount of work into it to make it usable. In these circumstances I mostly adhere to the Klingon coding rules [smart-words.org]: "A TRUE Klingon warrior does not comment his code!"
      Now, it should be noted that, sloppy code means: usability is utter shit and should not be confused with
  • This is so true (Score:5, Insightful)

    by LordNacho ( 1909280 ) on Sunday July 24, 2011 @03:12PM (#36864612)

    You can often tell whether someone is "programming as a means to an end (of your own)" versus "programming to build a tool for someone else". For instance, I have experience in the financial industry. Quite a lot of traders see coding as a means to implement their cool new model. Looking at their code, you can often tell. It's as if everything was built to just exactly fulfil the requirement, with no thought to the fact that those requirements might change. But of course, they do change. So you get hacks and workarounds, and cut'n'paste cargo cult code. Kinda like what those Orks in Warhammer 40K might make. And of course the problem with spaghetti code is that if you write it, nobody can ever help you solve problems/improve it. It's the coding equivalent of painting yourself into a corner. There's loads of smart traders out there with an excel spreadsheet that actually is an extension of their personalities (In fact it's their Magnum Opus. Everywhere they go, they try to take this quirky little file with them). Every little hack is something only they can explain (comments, yeah right. Do your body parts have explanatory comments?) and only they can fix if wrong.

    On the other hand, you sometimes hire a guy who is a programmer, but knows nothing about the domain. Very good with OO models and that kind, but you have to teach them everything about finance. What's a settlement date, what kinds of options exist, etc. You get what you ask for, because they know how to turn problems into object models, but you have to ask VERY carefully. And teach. Unfortunately, not everyone has time for that, and so you end up with something that still doesn't quite do what it's supposed to.

    So you often end up getting guys who understand the problems, but can't program, programming. And guys who can program, writing the wrong program.

    • by MacTO ( 1161105 )

      Many of those people who "can't program" actually can program. They simply understand the program's requirements. Maintainable code is not always a requirement since a lot of software written in research labs is intended to be written once and run a handful of times.

      It's also worth noting that properly structured code from a programmer's perspective is not always the same as properly structured code from a scientist's perspective. "Turn(ing) problems into object models," may be the last thing that scient

      • Maintainable code is not always a requirement since a lot of software written in research labs is intended to be written once and run a handful of times.

        While those of us who write programs on a regular basis know that often there's nothing more permanent than temporary code.

    • You get what you ask for, because they know how to turn problems into object models, but you have to ask VERY carefully.

      As a developer (and one in the financial industry at that), if there was one single thing I could suggest that would have the biggest impact on software development, it would be this:

      Enforce a 'Good Requirements' only policy.

      This may require a lot of training, and a great deal of rejected requirements for not meeting the standard, and cause great grief to those who think their 15 second explanation of an 80+ man hour feature implementation is sufficient. It

      • Isn't this short-term thinking endemic to the financial industry in the first place? I mean, programmers joke about unmaintainable code as personal job security -- but hard-charging finance guys would actually act on that.

        I assume they understand the incentives of their job, and have the training and/or personality to ruthlessly focus on that. They have bonuses and commissions to collect, yes? They get no additional income from well-written code, or something that assists their replacement next quarter or n

  • This is hardly unexpected. The code needed to process data from science experiments can be years in the making by one or few persons sculpting it to do the job they need done. It might be a bit much to say that it's throw-away code, but once the paper is out the door it probably won't see much use again.

    All of this combined with the fact that the coders are scientists and thus aren't concerned with UI issues and whatnot makes it so it may take a lot of manual intervention at various steps to use the softwar

  • And I gotta say - the linked blog post makes me think the author just got in an argument with his scientist boss, and he lost.

    • *shrug* different tools for different purposes. I'm a graduate student, and writing good, scalable, maintainable code would take far more time than I have, and is not what I'm paid to do. I'm paid to produce results. I've worked in industry before building huge infrastructure systems with many other people, and it calls for a completely different product. I'd go so far as to argue different skill sets. But speaking as someone who knows how to write "good" code, it's a waste of time for most academic applic

  • As a university researcher in applied game development I pretty much work on abstracting and generalizing *finished* software.

    I usually do this: I spend between six months and a year building a game according to some technique, framework or new language I am researching. The game is then finished, published and even sold. Then a paper is written describing the technique and its impact. Lather, rinse, repeat.

    This is just anecdotal experience, but in this day and age of shrinking research budgets it is not

    • by prefec2 ( 875483 )

      With scientists they do not mean engineers. They mean only mathematics, physics and its derivatives.

      • Computer scientists? I'm a programming languages guy, type systems and declarative/functional programming. I hardly see myself as an engineer and I've never published at an engineering conference...
        • In software development, things have become more and more planned, predicted and tested over the last decades. Something which was more or less an art is becoming a set of established techniques. So software development becomes more and more an engineering task. On the other hand, software developers and designers are always trying to use new stuff because the problems of today cannot be solved with the technology from 10 years ago.

          I was at a software engineering workshop on modeling and domain specific lang

  • The days of long-lived software are pretty much gone. There are a handful of companies that still maintain the programs they've written a long time ago, but most programs written today are written quickly and dirtily, to spring up one day and fall into oblivion the next. "Apps" are little more than short fads that come and go, easy to implement due to having little functionality, and just as easy to discard for the next one.

    • by sjames ( 1099 )

      I have dealt with a number of HPC programs on a regular basis that are old enough that they still refer to the input data as a "deck". They will never be completely rewritten and even small changes are few and far between because of the nightmare of re-validation. Unfortunately, because they started life as one-off research code, they are also fantastically sensitive to changes in the compiler due to accidentally depending on implementation quirks rather than the standards.
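One common way to make the re-validation described above cheaper is to compare a fresh run against archived reference output within a tolerance. The sketch below is an assumption about how such a check might look, not a description of the poster's programs; the function name and the default tolerance are chosen for illustration.

```python
import math


def compare_to_reference(new_values, reference_values, rel_tol=1e-9, abs_tol=0.0):
    """Return the indices where a fresh run disagrees with archived reference output."""
    if len(new_values) != len(reference_values):
        raise ValueError("output length changed; results are not comparable")
    return [
        i
        for i, (new, ref) in enumerate(zip(new_values, reference_values))
        if not math.isclose(new, ref, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

An empty list after a compiler upgrade means the numbers still agree to the chosen tolerance. A non-empty list points at the outputs to inspect first, though it cannot say whether the old or the new value is the correct one.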

    • Maybe you overlook the longer-lived stuff because it's everywhere and we're used to seeing it. What about the operating systems, languages and utilities your computer is using? Your office applications? The major database management systems? What about the webserver software? What about commerce, banking, healthcare, MRP, ERP systems?
  • This has nothing to do specifically with scientists, this is more about the difference between code you write for your own use versus code you write for others to use. Scientists aren't the only people who write code for their own use!

    Conversely, scientists often do write code that needs to be shared, sometimes among large groups. I used to work in the field of experimental high energy physics, which typically has collaborations of hundreds or even thousands of people. Some of the software I worked on w

  • by smcdow ( 114828 ) on Sunday July 24, 2011 @03:34PM (#36864746) Homepage

    The issues surrounding transitioning research S/W written by scientists into honest-to-goodness production systems are ones I'm very familiar with.

    At my company, a lot of energy has been put into bridging the gap over the years with varying results. I believe that the root cause of the problem is that research S/W is not an end-product; typically for scientists the end-product is a research paper, white paper, proposal paper, etc., for which the S/W is only a tool for getting to the end-product. As soon as the experimental (or proof-of-concept) S/W returns the desired results, the software is considered "done".

    In contrast, production S/W is often THE end-product for developers, so a lot more attention is given to robustness, re-usability, etc. All the standard thinking that you want to go into your production S/W.

    One big issue for us is that the research S/W is almost always written in Matlab, while the production code is written in C++ and Java. The single largest source of bugs in our systems is porting S/W from Matlab to C++ or Java. (As an aside, please let's not talk about the Matlab 'compiler' or Octave; we've already tried them both, and they're both performance hogs and also create SCM and CM nightmares.)

    We experimented with requiring that the research S/W be written in C++, but it was a disaster. The scientists couldn't get anything done, and the code was just awful. So, back to Matlab it was.

    And, my experience is that people who I have a great deal of respect for, who I consider brilliant in their fields, holding PhDs, etc., have produced the crappiest Matlab code I've ever had the sorrow to read. My favorite instance was the use of these local variable names within a single function of research S/W that was considered "done" (true story):

    i
    ii
    iii
    iiii
    iiiii
    iiiiii

    And, of course, little documentation as to the mechanics of the code. And believe me, it gets worse from there. Bear in mind that the code does indeed work for its particular purpose, and may well be ground-breaking in that particular research domain. But "done"? Ready for production? Not without a major porting effort (which is really a re-writing effort). The most mysterious thing to me, though, is that the scientists, for all their intellectual firepower, don't understand that it's a problem.

    The solution we've converged on is to require our bizdev to be responsible for funding efforts to rewrite the research code and get it integrated into the product baseline. And, the bizdev types can't proclaim a particular capability "done" (e.g., sell it to customers) until they've funded and executed those efforts. It took years of education to get to this point, but things are moving along much better than before.

    • The solution we've converged on is to require our bizdev to be responsible for funding efforts to rewrite the research code and get it integrated into the product baseline. And, the bizdev types can't proclaim a particular capability "done" (e.g., sell it to customers) until they've funded and executed those efforts. It took years of education to get to this point, but things are moving along much better than before.

      A couple of thorough re-orgs should be sufficient to undo all that.

    • by Titoxd ( 1116095 )

      Have you considered using Fortran? Matlab is much closer to Fortran than to C (arrays start at 1 instead of 0, to pick a rather annoying source of bugs), Fortran has plenty of libraries available (just like C), and the newer versions of the language standard are not the spaghetti-laden soup they used to be.
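
      To make the indexing point concrete, here is a minimal sketch in Python, standing in for any 0-based target language. The helper `matlab_slice` and the sample data are hypothetical and only illustrate how a Matlab range such as `x(2:4)`, which is 1-based and inclusive at both ends, has to be shifted when ported.

```python
def matlab_slice(values, first, last):
    """Return the elements Matlab would give for values(first:last).

    Matlab ranges are 1-based and include both endpoints, while Python
    slices are 0-based and exclude the upper bound.
    """
    if first < 1 or last > len(values):
        raise IndexError("range falls outside the 1-based bounds of the array")
    return values[first - 1:last]


if __name__ == "__main__":
    data = [10, 20, 30, 40, 50]
    print(matlab_slice(data, 2, 4))  # what the Matlab code meant
    print(data[2:4])                 # what a literal copy of the indices gives
```

      The literal copy runs without any error and silently returns different elements, which is why this class of porting bug tends to survive until someone compares the output against the original.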

    • I've worked with maths/science types before to integrate their formulas into linux-compatible production code.

      It's usually been pretty small formulas (a couple of screens full of code at most), and I don't understand most of it, being a programmer and not a maths guy. Also, we're a pretty small business, and these haven't been under any major deadlines (it fell more under R&D), so I had time to do this properly.

      Anyway, what I've done in the past is these steps:

      1. When their matlab version is done, ask t

  • Most scientists (e.g. physicists, chemists, mathematicians, geo-*) solve their problems with formulas. Then they code these formulas in a programming language, most likely C, Fortran, Algol68 (not really) or Matlab. While programmers often just code as well, software engineers try to design software and have to incorporate different aspects. This is true even when writing software for the sciences. However, the same applies to all the other fields we write software for.

  • There are programmers and there are Software Engineers.

    The two things are different, and people who don't know any better equate them.

    There's nothing wrong with being a programmer at all, but programming is a subset of Software Engineering.

    It's akin to the difference, imho, between a construction worker and an architect. One can be a hack or a craftsman, but tends to have a smaller overall picture of the where/what/when/why behind decisions that often seem unimportant or superfluous. The other can be inco

  • Punching down the logic was the easy and fun part. Exception handling is the main challenge. Then come middleware issues.

    It's easy to disassociate yourself and to become a patronising git and to claim to have done the hard work. Making software maintainable, supportable and well performing is never a matter of course.

    Generally speaking it is hard for a non-programmer to imagine a programmer's job. The tedious thing is that scientists will always have significant influence and may not appreciate the ha
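
    As a rough sketch of the point about exception handling, here is the same one-line piece of logic twice. Both function names and the specific checks are hypothetical, chosen only to illustrate how the error handling ends up outweighing the formula itself.

```python
import math


def mean_research(samples):
    # The "logic": fine when the author controls every input.
    return sum(samples) / len(samples)


def mean_production(samples):
    """Same formula, plus the checks a stranger's input makes necessary."""
    values = list(samples)
    if not values:
        raise ValueError("samples must not be empty")
    for value in values:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"non-numeric sample: {value!r}")
        if not math.isfinite(value):
            raise ValueError(f"non-finite sample: {value!r}")
    # fsum limits the accumulated rounding error on long inputs.
    return math.fsum(values) / len(values)


if __name__ == "__main__":
    print(mean_production([1.0, 2.0, 3.0]))
```

    The research version fails with a `ZeroDivisionError` on an empty list and quietly returns `nan` if one sample is `nan`. Deciding what should happen in those cases is the part that takes the time.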
  • "All programmers are optimists. Perhaps this modern sorcery especially attracts those who believe in happy endings and fairy godmothers... But however the selection process works, the result is indisputable: 'This time it will surely run,' or 'I just found the last bug.' So the first false assumption that underlies the scheduling of systems programming is that all will go well, i.e., that each task will take only as long as it 'ought' to take. The pervasiveness of optimism among programmers deserves more th

  • In all the (big-pharma) shops i worked at, i'd write and test the command-line number cruncher inside (until my boss could get a paper or two out of it) then hand it to "two guys" that would slap on a stunningly restrictive (in terms of functionality) GUI (itself a third party tool set based on Qt) and it'd sell just fine ...no sweat [shrug]
  • It is like saying:

    "DIY fixers do a hack job of wiring their routers in their home basements to their computers in second floor bedroom. They drill a hole and take the cable clearly marked "indoor use only" outside the home hanging in a lazy lopsided catenary curve up to the bedroom window, take it through the window into the house. The window sash does not close properly and allows bugs to get inside.

    Professional electricians on the other hand use flexible drills to make nice access holes, wire the cab

    • better would be to find a good common vertical point and then

      1 drill into the floor of the second floor and drop the cable down
      2 drill into the floor of the first floor (and repeat)

      bonus points if you can do a wall fish (tape a poker chip to a cr2032 and an LED then attach that to fishing line)

  • I've experienced almost the same thing, but with engineers instead of scientists. I attributed the engineers' disdain for software quality to a different motivation: namely, the disparity in status and pay between engineers and programmers. I believe that they felt that spending extra time on making the software readable, maintainable and all those other ables was beneath them.

    The proof of the pudding came when I happened to hire an inexperienced guy as a programmer. He was so smart that he soon learne

  • This problem is not just present in these two domains. You see this dichotomy elsewhere, specifically in IT.

    I do IT in a scientific research-oriented organization, having taken over for previous staff members who were very much of the "IT should be done like research" school of thought. The result was that each problem was addressed quickly and without any consideration for the whole. Being as they were working with physical assets and not just software (though there was a lot of that, too), the end result

  • This is a really common problem in the academy. So common in fact, that one particular academician has come up with a special license, the Community Research Academic Programming License (aka, the CRAPL). It's worth a look and good for a chuckle:

    http://matt.might.net/articles/crapl/ [might.net]

  • The job of a scientist is to come up with new ideas and test them.
    In that job, code is a tool, like a hammer or a mass spectrometer. If the tool works well enough for the job at hand, why on earth would you spend time making it work better? It is just crazy.
    The other problem is that scientists are arrogant (so they think what works for them is ok for others) and non-scientists are stupid, as they expect scientists to do their work for them - in this case, production code.
    sigh
    It is not a scientist's job
  • This is true, and starting to become a problem given the increasing expectation that as much as possible of our scientific work should now be open source. While this is great in theory, in practice, it means I (as a scientist) am under pressure to make my scientific "exoskeleton" code publicly available. I'm not qualified (and don't have the time) to polish it up into a product that is really suitable for distribution, and my employer doesn't have the funds to hire programmers to do this for every piece of

  • Bullshit. False issue. Move along. There is nothing interesting here.

  • I wish it were as simple as this thread implies. The truth of the matter is that most commercial developers who are paid to worry about maintainability don't understand how to do it much better than their academic counterparts. Managers notice this and put all kinds of process in place to enforce good practice--requirements and design docs that are practically books, compile-time coding standard tests, smoke tests, regression test suites, automated tests and so on and on and on. These do not, however, turn
  • It's naive to consider either class of software as being sufficient, or either kind of programming to be superior. Like most problems there is a strong management component to assigning resources to each in appropriate scales.

    A computer scientist/software engineer delivered a well-phrased summation of half of this discussion during a 5-minute talk at a recent lightning software session at a science meeting. (Note that there are rarely science sessions at software meetings.) A domain scientist/software en

  • As pointed out by others, examples of the reverse can also be found in practice. However, I agree that this generalization holds for coding practices. I'd also like to add another related generalization: autodidact developers tend to code quick and dirty (with a lot of experience of how the actual code runs in daily practice under heavy loads, etc.), while people with a heavy academic background (this also depends on the specific university) tend to code slower and cleaner.

    That said, in business I often hear some develo
