Python Government Programming

What America's NSA Thinks of Python (zdnet.com) 74

"Now budding Python developers can read up on the National Security Agency's own Python training materials," reports ZDNet: Software engineer Chris Swenson filed a Freedom of Information Act request with the NSA for access to its Python training materials and received a lightly redacted 400-page printout of the agency's COMP 3321 Python training course. Swenson has since scanned the documents, run OCR on the text to make it searchable, and hosted it on DigitalOcean Spaces. The material has also been uploaded to the Internet Archive...

"If you don't know any programming languages yet, Python is a good place to start. If you already know a different language, it's easy to pick Python on the side. Python isn't entirely free of frustration and confusion, but hopefully you can avoid those parts until long after you get some good use out of Python," writes the NSA...

Swenson told ZDNet that it was "mostly just curiosity" that motivated him to ask the NSA about its Python training material. He also said the NSA had excluded some course material, but that he'll keep trying to get more from the agency... Python developer Kushal Das has pulled out some interesting details from the material. He found that the NSA has an internal Python package index, that its GitLab instance is gitlab.coi.nsa.ic.gov, and that it has a Jupyter gallery that runs over HTTPS. NSA also offers git installation instructions for CentOS, Red Hat Enterprise Linux, Ubuntu, and Windows, but not Debian.

This discussion has been archived. No new comments can be posted.

  • public domain (Score:3, Insightful)

    by mmjo ( 1396615 ) on Sunday February 16, 2020 @08:04AM (#59732768)
    I wish that government bodies put these kinds of texts in the public domain.
    • Their bugs are in the public domain. Just break open a motherboard. :-D

    • I wish that government bodies put these kinds of texts in the public domain.

      That is not going to happen.

      When the government doesn't like citizens being uppity, they will find subtle ways to get even. One way to do so is to take electronic documents (such as PDFs), print them out, then scan them back in as rasterized images, and then provide the citizen with the information in a bloated and far less useful format. That is what the NSA is doing here, so they clearly don't like giving out this information.

      Government prosecutors use the same print-scan-send technique when giving disc

      • This tends to be a side effect of the redaction process, in an effort to make sure redacted text is not recoverable.

    • Comment removed based on user account deletion
      • FTP? You're a funny old man. That's been obsolete for decades due to unmanageable security flaws.

  • by Slicker ( 102588 ) on Sunday February 16, 2020 @09:07AM (#59732834)

    I have been coding since the mid-1980s, starting with BASIC, Assembly, then C, then C++, and on through many other languages; today it is mostly JavaScript (browser-based and Node.js), C#, Python, and SQL.

    Python always feels to me like a "baby language". That's ok because it does serve its niches reasonably well. It's easy to learn, easy to read, and it has libraries for working with most kinds of things that it would be too slow for, such as processing large lists of things. It certainly has its ugly parts, too.

    Using libraries, it's ok (not great) for data processing (requiring high performance and flexibility over large data sets) and is actually quite good for business logic (requiring simplification and clarity).

    However, there hasn't been a decent language built for data processing or business since COBOL. SQL quickly becomes insanely complicated when evaluating problems that span multiple inter-related rows. Graph databases handily solve nearly every problem one can imagine, particularly Neo4j, but the query languages/methods themselves add complexity. Other NoSQL databases also tend to lack in expressiveness and performance with regard to flexible analysis. For example, MongoDB is fine if you keep all your data organized in specific documents and don't anticipate any substantial complexity between them. Their developer's comment that you should "do your joins at design time" illustrates that they just completely don't get it. Also, the JSON-based query language is hard to look at, even with simple queries. MongoDB really is for front-end developers of simpler, isolated or non-enterprise applications. It will not do for data warehousing or sophisticated report capabilities, such as from data cubes, etc.

    I actually did some work using Microsoft JScript that really impressed me for data processing and business logic. JavaScript data structures are quite versatile, fast, easy, and clear to work with (once a new user understands pointers). And in Microsoft JScript, you have synchronous access to databases like SQL Server. Although this does not perform nearly as well as Node.js, it provides for the ability to specify synchronous business processes and abstract them into sub-functions. You cannot do this in Node.js. The performance hit to execution of JScript is not too much in exchange for the vastly greater developer performance and reduced risk of coding mistakes. There are no race conditions. A complete lack of synchronous database access is the only thing holding back Node.js from being the overall best data processing and business logic language of all time, in my opinion.

    Actually, I wrote my own Tree Graph database that uses the same data object access syntax as does JavaScript. I simply extended it a little. Mainly, I enabled query notation with square brackets. For example,

            myorg.projects[ department = "mydept" and budget.projection > budget.authorized ].name,manager,budget.projection-budget.authorized;
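For readers unfamiliar with the bracket notation, the query above can be read as a filter followed by a projection. Here is a rough plain-Python sketch of that reading. Only the field names come from the query; the sample records, the function name `over_budget`, and the assumption that the last expression is a computed column are illustrative, and none of this says anything about how the actual database works.

```python
# Hypothetical sample data; only the field names come from the query above.
projects = [
    {"name": "alpha", "manager": "Ann", "department": "mydept",
     "budget": {"projection": 120, "authorized": 100}},
    {"name": "beta", "manager": "Bob", "department": "mydept",
     "budget": {"projection": 80, "authorized": 100}},
    {"name": "gamma", "manager": "Cy", "department": "other",
     "budget": {"projection": 300, "authorized": 100}},
]


def over_budget(projects, department):
    """Return (name, manager, overrun) for projects in `department`
    whose projected budget exceeds the authorized budget."""
    return [
        (p["name"], p["manager"],
         p["budget"]["projection"] - p["budget"]["authorized"])
        for p in projects
        if p["department"] == department
        and p["budget"]["projection"] > p["budget"]["authorized"]
    ]


result = over_budget(projects, "mydept")
print(result)
```

The list comprehension plays the role of the square-bracket condition, and the tuple plays the role of the comma-separated field list.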

    Currently, I am working (in what spare time I have) to fully integrate this with my p2p mesh network. A dream, if I had the time, would be to fork Node.js such that this is its native memory system. In this case, I would enable persistence to conditions for data arrays and objects, the default condition being normal JavaScript variable scoping. Otherwise, they would be persistently stored in the mesh. So for example, you could specify a certain data object to persist only while a certain process is running or until some event occurs.

    The mesh "charges" for use of computational resources. One's computer tied to the mesh provides computation resources, like memory, disk storage, computing time, bandwidth, and anything else custom (e.g. interface to human user or interface to static IP address). So you get "paid" for what you provide to others and you can use those credits to use the resources of others, in exchange. Prices are strictly and mechanically supply and demand with also some taken for mesh overhead.

    I talk too much.. sorry.. It's Sunday morning and I need to get back to work on that.. it's what I am working on now... other than Slashdot.

    Thanks,
    --Matthew C. Tedder

    • > SQL quickly becomes insanely complicated when evaluating problems that span multiple inter-related rows.

      Spend an hour learning common table expressions and try using them next time. They largely solve that. They are similar to functions in other languages.

      Without CTEs you can have statements embedded inside statements embedded in statements six levels deep, like this:

      Query1 join
      Query2 Where
      Query3 join
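A minimal sketch of the idea, run through Python's built-in `sqlite3` module so that it is self-contained. The `orders` table, its columns, and the data are invented for illustration. The point is only that a named CTE lets the outer query read top to bottom instead of burying one SELECT inside another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 50), ('ann', 70), ('bob', 20), ('cy', 200);
""")

# Nested form: the per-customer totals are buried inside the outer query.
nested = conn.execute("""
    SELECT customer, total
    FROM (SELECT customer, SUM(amount) AS total
          FROM orders GROUP BY customer)
    WHERE total > 100
    ORDER BY customer
""").fetchall()

# CTE form: the same subquery gets a name and is defined up front.
with_cte = conn.execute("""
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 100
    ORDER BY customer
""").fetchall()

print(nested)
print(with_cte)
```

With one level of nesting the difference is small; it grows with each additional level, because every extra subquery becomes another named entry in the WITH list rather than another layer of parentheses.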

      • "Python is the closest thing there is to Lego in programming languages."

        Squeak

        • "Python is the closest thing there is to Lego in programming languages."

          Squeak

          Just do not step on it when the lights are out.

      • Having said I wouldn't generally build enterprise-grade systems with Python, I should acknowledge where I do see it as useful.

        It can be great for a script you are going to run once, or little utilities for use on your personal desktop. For these, you don't invest in the kind of practices you use for production systems, peer review and all of that. Quick and easy fits here.

        Python also happens to be good with large integers. I've used it to break encryption, where you are dealing with 128 bit and larger nu
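Python's built-in `int` type is arbitrary precision, so values well beyond 128 bits need no special library. A small sketch with toy numbers follows; the modulus and base are arbitrary illustrations and are not taken from any real cryptosystem or from the comment above.

```python
# Python ints grow as needed; there is no overflow at 64 or 128 bits.
n = 2**128 + 1
bits = n.bit_length()
print(bits)

# Three-argument pow() performs modular exponentiation without ever
# building the huge intermediate power.
modulus = 2**127 - 1  # a Mersenne prime, used here only as an example
residue = pow(3, modulus - 1, modulus)
print(residue)
```

Because the modulus is prime and does not divide 3, Fermat's little theorem says the residue must be 1, which makes this an easy sanity check.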

      • by Kjella ( 173770 )

        You CAN do the same with temporary tables, but then you've started doing imperative scripting in a declarative language. That will ruin performance as well as screwing up your ability to think declaratively.

        If you have a query to fill a temp table transforming it to a WITH clause or the other way around is child's play.

        INSERT INTO #temp1 SELECT [long ugly query]
        INSERT INTO #temp2 SELECT [query with temp1]
        SELECT * FROM #temp2
        vs
        WITH cte1 AS ([long ugly query]), cte2 AS ([query with cte1])
        SELECT * FROM cte2

        One enforces an order of execution and the other doesn't, but all the hard bits of figuring out if the query actually returns what you want is the same. My experience has been the opposite of yours, you can throw e
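The equivalence described in this comment can be checked with a small sketch using Python's built-in `sqlite3` module. Note the assumptions: SQLite spells temp tables as `CREATE TEMP TABLE ... AS` rather than the `#temp` form shown above, and the `sales` table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 10), ('north', 30), ('south', 5), ('south', 5);
""")

# Temp-table version: each step is materialized before the next one runs.
conn.executescript("""
    CREATE TEMP TABLE temp1 AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
    CREATE TEMP TABLE temp2 AS
        SELECT region FROM temp1 WHERE total > 20;
""")
via_temp = conn.execute(
    "SELECT region FROM temp2 ORDER BY region"
).fetchall()

# CTE version: the same two steps, named inline in a single statement.
via_cte = conn.execute("""
    WITH cte1 AS (
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region
    ),
    cte2 AS (
        SELECT region FROM cte1 WHERE total > 20
    )
    SELECT region FROM cte2 ORDER BY region
""").fetchall()

print(via_temp)
print(via_cte)
```

Both versions return the same rows. What differs is that the temp-table version fixes the order of execution and stores intermediate results, while the CTE version leaves those decisions to the query planner.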

        • True, you absolutely CAN partially fix your performance problems (temporarily) by throwing out the entire concept of declarative programming and writing an imperative script instead. That works okay until something changes. Of course, that brings up the question of why you're using SQL, if you want to write imperative scripts. It's kinda like finding a way to haul bricks on a motorcycle at that point.

          It's also a very sub-optimal solution because temp tables themselves are dog slow - just about the slowest op

        • > it's like showing your work in math not just the final answer on the dotted line

          If you think that through, you're suggesting that this:

          SELECT * FROM @managers

          Is easier to read than this:

          SELECT * FROM managers

          You really think the @ sign makes it a lot more readable?

          That's the difference between using a CTE and a temp table. The @ sign, which forces the server to run that part of the query NOW, and write all of the results to disk - even when it's not going to use half the results because there is an inde

      • Another great thing about SQL is that, like COBOL, you have convenient access to decimal math, the stuff that humans use.
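Python offers something comparable through its standard-library `decimal` module. A short sketch of the difference between binary floats and decimal values (the amounts are arbitrary examples):

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum = 0.1 + 0.2
float_matches = (float_sum == 0.3)
print(float_matches)

# Decimal values built from strings keep exact base-10 digits.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_matches = (decimal_sum == Decimal("0.30"))
print(decimal_matches)
print(decimal_sum)
```

Constructing `Decimal` from strings rather than from floats matters here: `Decimal(0.1)` would inherit the float's binary rounding error.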

        • What programming language are you thinking of (besides assembler) where you can't do decimal math?

          That said, I like Tzeltal math. Base 20. Those Mayans really knew what they were doing.

          • No, Mr. Dumb Ass, writing your own math routines is not a serious answer.

            Go and do somebody's accounting in base 20 and see if you get the same number of bugs as somebody using decimal math. And see if the code auditor gives you high marks for being clever.

            Your attitude is the exact type and character of stupidity that leads to it being a good idea to do accounting in COBOL. Anything other than decimal math is too hard in COBOL, so you won't screw the accounts up with a float.

            Programming having to do with k

    • by owlaf ( 5251737 )
      I have done a decent amount of text parsing and log file generation. I have done this with C, C++, Java, and Python. C++ and Java have a good number of the libraries needed to accomplish this; they were similar to the libraries available in Python. I get the impression the real advantage is that a compiler is not needed for any small quick changes
    • by orlanz ( 882574 ) on Sunday February 16, 2020 @11:31AM (#59733060)

      Examples please? I have done many data processing programs that hold a few (5-16) GBs of data in RAM at a time over the years (mid-90s). I am not talking about RAW data, but just the information that needs analysis.

      I honestly haven't found anything that comes close to the flexibility, turnaround, and dev speed as Python or Perl. Yes, C, C++, C#, Java, etc. do the analysis faster (20% for C) and I can crunch that 5GB down to 3GB with proper C structs, but the loss in overall time & speed for changes puts the Ps in front. Of course the data processing isn't pure Python; the heavy number crunching is entirely C (5% of the process) and many of the analysis libraries also have their heavies in C/C++, but overall... you get the dev speed and flexibility of Python while barely touching C.

      There are also other inefficiencies that I add to provide clarity in the process overview & juggle system resources. Like temporary SQLite instances, remote SQL servers, status/data messages between instances, etc. But Python just makes gluing all these various tools together so easy that it doesn't really add complexity but does allow various tools to do the part they are good at.

      • by ShanghaiBill ( 739463 ) on Sunday February 16, 2020 @12:40PM (#59733196)

        I have done many data processing programs that hold a few (5-16) GBs of data in RAM

        The size of the data isn't the issue, it is the size of the program. Python is great for programs of less than a thousand lines, worked on by one person, and that have no serious consequences for a runtime error (nobody dies).

        When you move out of that comfort zone, and into million line projects, developed by teams of people of varying levels of competence, and with serious reliability and performance requirements (plane shouldn't fall out of the sky), then Python is not the right tool.

        • But at that point "million line projects, developed by teams of people of varying levels of competence, and with serious reliability and performance requirements (plane shouldn't fall out of the sky)" you also put dev speed and flexibility firmly on the back burner because other things take priority. And in all honesty in most of these cases it seems like a scripting language of any kind would be out of the question from day one.
      • I honestly haven't found anything that comes close to the flexibility, turnaround, and dev speed as Python or Perl.

        Until you need to refactor or work with lots of people....

    • Your graph query language resembles XPath. Once I overcame my SQL paradigm habit, it seemed elegant to me.
    • Graph databases handily solve nearly every problem one can imagine, particularly Neo4j, but the query languages/methods themselves add complexity.

      I'm with you there, graph databases are great. Their only drawback is that finding people who know how to work with them is hard.

      is the only thing holding back Node.js from being the overall best data processing and business logic language of all time, in my opinion.

      I don't see how you can get from COBOL to this. COBOL is better than Node.js.

  • The "designate a block of code by the amount of indentation" aspect is a pain to get used to.
    • The "designate a block of code by the amount of indentation" aspect is a pain to get used to.

      Initially it is annoying, but you do get used to it within a few hours. Personally I believe it is the least of Python's problems, though it does appear to be the one that most people complain about.

      • The "designate a block of code by the amount of indentation" aspect is a pain to get used to.

        Initially it is annoying, but you do get used to it within a few hours. Personally I believe it is the least of Python's problems, though it does appear to be the one that most people complain about.

        That's because it's dumb and unnecessary; implemented as a whim by its creator. Using actual delimiters to delimit blocks has no downside. I'd be fine with Python *allowing* whitespace-delimited blocks as long as they weren't required. I think it's a nice feature for short and/or logic (not syntax) learning purposes -- pseudo-code that runs. Longer, more complicated things benefit from more, not less, defined structure.

        • ... Longer, more complicated things benefit from more, not less, defined structure. ... --- Agreed.
        • That's because it's dumb and unnecessary

          No, that's because it's immediately obvious. It's an easy target without having to dig deeper. This is a benefit both in making the objection and in finding a large audience that shares that understanding. It's a far easier argument to make than any of the arguments in favor of Python.

          Which I'm ok with because I think "one way to do it" limits creativity and hurts the future of software.

      • ... Initially it is annoying, but you do get used to it within a few hours. ... --- Then the next day, you have to get used to it all over again. Eventually, you do get used to it, but I never stopped wondering ... why, just why, is it considered A Good Idea?
      • I agree, although I will note that nobody complains about Haskell's offside rule.

        I think that the reason why people complain about it in Python and not in Haskell... well, there are several reasons, and one of them (tabs vs spaces) is stupid and belongs in the 90s.

        Apart from the difficulty of refactoring, I expect that one of the big problems is that a lot of programmers coming to Python see indentation and assume it's a block-structured language like C or Perl, then get a rude shock when they find that it

    • The "designate a block of code by the amount of indentation" aspect is a pain to get used to.

      There's a great workaround for that called Perl. :-)

    • The "designate a block of code by the amount of indentation" aspect is a pain to get used to.

      It is easy to get used to, but will never stop feeling stupid, because IT IS stupid ;)

  • You're saying the Ubuntu instructions wouldn't work on Debian?

  • I was looking for "ready to use" material for introducing Python to 8th-grade students. This wasn't it. It is a good core that could be used to build a course, but I suspect that I would need at least a full summer break, if not more, to turn it into a real course.

    I used the Microsoft course in the past. It worked reasonably well. I did it as a "watch them do in the video, then watch me do, then go as a whole class." That worked well, but I had a lot of students transfer out right away, as they found it to
    • The culling is working pretty well, though not well enough. I would estimate that the rate of "programmers" existing in the general population is about 1 in 100,000 (0.001%), which is about 1,000 times the rate of the existence of other "it just works" specialists (mathematicians, musical composers, etc). So if you are a "good teacher" you might expect to come across *ONE* programmer in your entire lifetime. If you are lucky!

      While any asshole can learn to code (and many do) very few are skilled at it. A

      • by Jzanu ( 668651 )
        Retired ICS, it is your attitude, or rather the type that your attitude represents, that is the main problem here. It isn't a zero-sum career war. Programming is a literacy now, and there is no need to "filter" or find some arbitrarily defined ~true~ programmers in a course intended to develop computational literacy. There is a reason abstract algebra is not taught rigorously in primary school, and instead things like basic numeracy and counting are emphasized first. Now it is the information age but that i
        • "There is a reason abstract algebra is not taught rigorously in primary school, and instead things like basic numeracy and counting are emphasized first."

          As Homer Simpson would say, "Doh!". Before you can understand abstract algebra one must understand basic numeracy and counting. You would find it very difficult to teach the solving of linear algebra equations to someone who does not know what a number is, and what a variable is, even if they were 80 years old.

          "Now it is the information age but that is n

      • Guess you're lamenting the fact that you're in the 99,999, eh?

    • It's a tricky problem - the best teaching language may not be the best production language. Python has a lot of "magic" things that are handled out of sight. There is some advantage to a simple language (some modern equivalent of BASIC) that lets students have a better feel for what the computer is actually doing.

      It might also work to teach a limited version of Python.

      • As one with only a very basic understanding of programming, and none of teaching, I'm just wondering what is wrong with teaching C#. At a beginner level it is not that hard to read, and it is strongly typed, so the compiler catches quite a few errors that would create hard-to-catch runtime errors in weakly typed languages. You have free IDEs on most OSes (esp. Windows with VS Community Edition). C# also seems to be a language in demand. And it is rather versatile, spanning from console apps to desktop
      • by Hasaf ( 3744357 )
        I have been thinking about some implementation along this line. I would like to find some form of BASIC that is well supported by training material.

        One thing I am NOT to teach is Java (no real tears over that one). The reason is that the High School teaches it (it is still part of the AP curriculum). They want to start from zero.

        Said in the light of the comment above, on the probability of a great programmer, I am not a great programmer. It just doesn't draw my interest. So, yes, it needs to be someth
  • CentOS, Red Hat Enterprise Linux, Ubuntu, and Windows, but not Debian, or TempleOS.