Python Programming

Python 3.4 Released

New submitter gadfium writes: "Python 3.4 has been released. It adds new library features, bug fixes, and security improvements. It includes: a standardized implementation of enumeration types, a statistics module, improvements to object finalization, a more secure and interchangeable hash algorithm for strings and binary data, asynchronous I/O support, and an installer for the pip package manager."
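As a rough illustration of two of the library additions named in the summary, the sketch below uses the enum and statistics modules from the 3.4 standard library. The Color class and the sample values are made up for the example and do not come from the release notes.

```python
from enum import Enum
import statistics


class Color(Enum):
    """Hypothetical enumeration, used only to illustrate the enum module."""
    RED = 1
    GREEN = 2
    BLUE = 3


# Members can be looked up by value or by name, and iterate in definition order.
by_value = Color(2)
by_name = Color["RED"]
names = [member.name for member in Color]

# The statistics module provides basic descriptive statistics.
samples = [2.5, 3.5, 4.0, 6.0]
sample_mean = statistics.mean(samples)
sample_median = statistics.median(samples)

print(by_value, by_name, names, sample_mean, sample_median)
```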
This discussion has been archived. No new comments can be posted.

  • by LifesABeach ( 234436 ) on Tuesday March 18, 2014 @10:05PM (#46521583) Homepage
    That would be nice.
    • by wjcofkc ( 964165 )
      I am a little confused about your request. On my very modest system, Python takes just under three minutes to compile from: extract > cd > ./configure > make > make install

      I run several Ubuntu derivatives and honestly never considered apt-get - but I also often run more than one version of Python on any given system and compiling manually makes that easier to maintain. If you are a Linux user so stuck on apt-get that you cannot work with source code at all, I highly suggest you download the sou
    • by wjcofkc ( 964165 )
      In fact I will even get you started:
      cd ~/Downloads
      tar -zxvf Python-3.4.0.tgz
      cd Python-3.4.0
      ./configure
      make
      sudo make install

      This harmless method will only install python in the directory you built it in. So if you type "python" you will still get the old interpreter. If you type ./python you will get 3.4 - As far as replacing your existing installation completely or doing something more complicated, I will leave it to you to Google that so I don't lead you down an irreversible path you did not inte
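To check which interpreter a given command actually launches after a build like the one above, a small script along these lines may help. It is only a sketch: describe_interpreter is a hypothetical helper name, and you would run the script once with each candidate command (for example python and ./python) and compare the output.

```python
import sys


def describe_interpreter():
    """Return (version string, executable path) for the running interpreter."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return version, sys.executable


version, path = describe_interpreter()
print("Python", version, "running from", path)
```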
  • and... (Score:5, Insightful)

    by Anonymous Coward on Tuesday March 18, 2014 @10:16PM (#46521627)

    And everyone will keep using 2.6/2.7, the windows XP of python.

    • by jasonla ( 211640 )
      And in about 20 years, it will make it into the RHEL derivative my company uses... sigh.
      • In fairness Python 3 isn't really as widespread as it should be. I think people have found the 2.7 branch just works well for them.

        With that said I do wish people WOULD move to python 3. 2.7's unicode handling is infinitely awful and fragile compared to 3.

        • Paraphrasing other Slashdot posts I have seen, there are no compelling reasons to upgrade to Python 3. Removing the global interpreter lock would be one major reason to, but no one has submitted a good patch for that, and besides, someone would probably just backport it to Python 2.7.

          • Re: and... (Score:5, Insightful)

            by dmbasso ( 1052166 ) on Tuesday March 18, 2014 @11:26PM (#46521929)

            There are plenty of good reasons to use Python 3; it is way more elegant and consistent. The way text and binary data are dealt with is incomparably better. I doubt that anyone who has ever done any serious coding in Python 2 has escaped the mindfuckery of mixing unicode and ascii.

            The problem for a wider acceptance continues to be the libraries... for instance, Twisted. It is good that there is an async module in the standard library now, but too bad that my code already relies heavily on Twisted.

            And about the GIL: if you are complaining about it, you most probably are not using the right language for the job.

            • by Kremmy ( 793693 )
              The way I've heard it, the manner in which Python 3 has modified the Python Standard Library has made it so that cases where you aren't working with pure Unicode data (such as in any real world problem) get all the hassle of Python 2 and more. Interoperability with foreign systems is kind of a basic foundation of data processing; having to work around inconsistencies in the Python 3 Standard Library to do so probably means it's no longer the right tool for the job.
              • Re: and... (Score:4, Informative)

                by spitzak ( 4019 ) on Tuesday March 18, 2014 @11:59PM (#46522055) Homepage

                This exactly.

                If your UTF-8 string is not completely valid, Python 3 barfs in useless and unpredictable ways. This is not a problem with Python 2.x.

                Until they fix the string so that an arbitrary sequence of bytes can be put into it and pulled out *UNCHANGED* without it throwing an exception then it cannot be used for any serious work. Bonus points if this is actually efficient (ie it is done by storing the bytes with a block copy).

                Furthermore it would help if "\xNN" produced raw byte values rather than the UTF-8 encoding of "\u00NN" which I can get by typing (gasp!) "\u00NN".
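For reference on the round-trip point, Python 3.1 and later ship a "surrogateescape" error handler (PEP 383) that lets undecodable bytes pass through a str and come back out unchanged. The sketch below assumes a made-up byte string containing two invalid bytes; whether this satisfies the efficiency wish above is a separate question that the example does not address.

```python
# Mostly valid UTF-8 ("caf\u00e9") followed by two bytes that are not valid UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe tail"

# Strict decoding rejects the invalid bytes.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape maps each undecodable byte to a lone surrogate (U+DC80..U+DCFF).
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = text.encode("utf-8", errors="surrogateescape")

print(strict_failed, ascii(text), round_trip == raw)
```

Note that the resulting str still raises an error if it is encoded strictly, so the error surfaces when the text is written out as UTF-8 rather than when it is read.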

                • Re: and... (Score:5, Informative)

                  by Anonymous Coward on Wednesday March 19, 2014 @01:19AM (#46522289)

                  This is why there's a bytes type.

                  If what you have is not text, don't use the text type.
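A short sketch of the bytes/str split being described, including the "\xNN" escape behavior raised earlier in the thread. The payload value is arbitrary and chosen only for illustration.

```python
# In a bytes literal, \xNN is a raw byte value.
payload = b"\xde\xad\xbe\xef"
first_byte = payload[0]          # indexing bytes yields an int

# In a str literal, \xNN is the code point U+00NN, not a byte.
char = "\xe9"
same_code_point = (char == "\u00e9")
utf8_form = char.encode("utf-8")  # two bytes in UTF-8

# Python 3 refuses to mix the two types implicitly.
try:
    "text" + b"bytes"
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(first_byte, same_code_point, utf8_form, mixing_allowed)
```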

                  • by spitzak ( 4019 )

                    No, all that means is that EVERYTHING has to be changed to use the bytes type.

                    I mean every single library function that takes a unicode string, every use of ParseTuple that translates to a string, etc. Pretty much the entire Python library must be rewritten, or a wrapper added around every function that takes a string argument.

                    Everybody saying that "it's good to catch the error earlier" obviously has ZERO experience programming. Let's see, would it be a good idea if attempting to read a text file failed if

                • by godefroi ( 52421 )

                  If it's not UTF-8, why do you claim it's UTF-8?

                  That's like arguing that XML parsers should allow unclosed tags, because otherwise, they just throw exceptions and can't be used for serious work.

                  You're probably the guy we have to thank for "tag soup". Asshole.

                  • What should a well-behaved program do with bytes objects pulled from a database that already contains plenty of invalid encoding?
                    • by spitzak ( 4019 )

                      The program should produce an error AT THE MOMENT IT TRIES TO EXTRACT A Unicode CODE POINT. Not before, and not after.

                      If the program reads the invalid string from one file and does not check it and writes it to another file, I expect, and REQUIRE, that the invalid byte sequence be written to the new file. It should not be considered any more of a problem than the fact that programs don't fix spelling mistakes when copying strings from one place to another.

                    • by tepples ( 727027 )

                      The program should produce an error AT THE MOMENT IT TRIES TO EXTRACT A Unicode CODE POINT. Not before, and not after.

                      Which leaves open the question of how best to clean up all the tens of thousands of existing records that may not be valid UTF-8. Otherwise: "This product is unavailable for purchase because its description contains invalid data. This problem has been reported to the store owner."

                    • by spitzak ( 4019 )

                      Well the first thing you need to do to clean up the invalid UTF-8, for instance in filenames, is to detect it.

                      If reading the filename causes it to immediately throw an exception and dispose of the filename, I think we have a problem. Right now you cannot do this in Python unless you declare it "bytes" and give up on actually looking at the Unicode in the vast majority of filenames that *are* correct.

                      It is also necessary to pass the incorrect filename to the rename() function, along with the correction. That
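On the filename case specifically, the os module provides os.fsdecode and os.fsencode, which on POSIX systems use the surrogateescape handler. The sketch below assumes a POSIX platform and a made-up filename; has_undecodable_bytes is a hypothetical helper, and no file is actually created or renamed.

```python
import os


def has_undecodable_bytes(name):
    """Return True if a str filename carries bytes smuggled in by surrogateescape."""
    return any(0xDC80 <= ord(ch) <= 0xDCFF for ch in name)


# A filename containing a byte that is not valid UTF-8 (hypothetical example).
raw_name = b"report-\xff.txt"

# On POSIX, fsdecode keeps the bad byte as a lone surrogate instead of raising.
name = os.fsdecode(raw_name)

# fsencode restores the exact original bytes, so the name can still be
# handed to calls such as os.rename().
restored = os.fsencode(name)

print(ascii(name), restored == raw_name, has_undecodable_bytes(name))
```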

                    • by godefroi ( 52421 )

                      Hey, I figured out what your problem is, where you went wrong. You think that a string and a bunch of bytes are the same thing. They're not. If you have a bunch of bytes, treat it as a bunch of bytes. If you have a string, treat it as a string.

                      Java, for example, stores strings internally as UTF-16 (or UCS-2, opinions differ). .NET stores them internally as UCS-2.

                      This is also why there's a difference between CHAR and NCHAR in databases.

                      There is not a one-to-one mapping from a given string to a given set of b

                    • by godefroi ( 52421 )

                      Pretty much every string operation is going to require decoding. Things like substr(), replace(), split(), join(), etc are all going to require decoding the string.

                    • by spitzak ( 4019 )

                      Aha! Somebody who really does not have a clue.

                      No, substr() does not require decoding, because offsets can be in code units.

                      No, replace() does not require decoding, because pattern matching does not require decoding, since UTF-8 is self-synchronizing.

                      No, split() does not require decoding, because offsets can be in code units.

                      No, join() does not require decoding (and in fact I cannot think of any reason you would think it does, at least the above have beginning-programmer mistakes/assumptions).
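The claim about working in code units can be tried directly on Python 3 bytes objects. This is a sketch with made-up sample text; it shows split and replace operating on UTF-8 bytes without decoding, and also shows the caveat that slicing at an arbitrary byte offset can cut a code point in half.

```python
# Sample text with non-ASCII characters, written with escapes for clarity.
text = "na\u00efve,caf\u00e9,\u65e5\u672c\u8a9e"
data = text.encode("utf-8")

# Splitting on an ASCII delimiter is safe: continuation bytes are all >= 0x80.
fields = [field.decode("utf-8") for field in data.split(b",")]

# Replacing a whole encoded character works on the byte level as well.
replaced = data.replace("\u00e9".encode("utf-8"), b"e").decode("utf-8")

# Slicing by byte offset can land inside a multi-byte sequence.
cafe = "caf\u00e9".encode("utf-8")   # b'caf\xc3\xa9'
try:
    cafe[:4].decode("utf-8")
    split_code_point = False
except UnicodeDecodeError:
    split_code_point = True

print(fields, replaced, split_code_point)
```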

                    • by spitzak ( 4019 )

                      Stupid software that thinks it has to convert to UTF-16 is about 95% of the problem.

                      UTF-16 cannot losslessly store invalid UTF-8. It also cannot losslessly store an odd subset of arrangements of Unicode code points (it can't store a low surrogate followed by a high surrogate, because this pattern is reserved to mean a non-BMP code point). It also forces a weird cutoff at 0x10FFFF which a lot of programmers get wrong (either using 0x1FFFF or 0x1FFFFF). UTF-16 is also variable sized and has invalid sequences,
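Two of the UTF-16 details mentioned above can be checked from Python 3.3 or later: the upper limit of the code point range and the surrogate-pair encoding of a non-BMP character. This is only an illustrative sketch; the snake emoji is an arbitrary choice of non-BMP code point.

```python
import sys

# The highest valid code point is 0x10FFFF.
max_code_point = sys.maxunicode

# One past the limit is rejected.
try:
    chr(0x110000)
    beyond_limit_ok = True
except ValueError:
    beyond_limit_ok = False

# A non-BMP character takes two 16-bit code units (a surrogate pair) in UTF-16.
snake = "\U0001F40D"
utf16 = snake.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")

print(hex(max_code_point), beyond_limit_ok, len(utf16), hex(high), hex(low))
```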

                    • by godefroi ( 52421 )

                      Maybe you should design your own platform where strings will be represented internally as UTF-8. It would be an interesting exercise.

                    • by godefroi ( 52421 )

                      Well, yeah, but that would completely change the way these things work. What if your split() worked on code units, and you broke up a code point? That certainly wouldn't produce results that anyone would consider optimal, or even useful.

                      You can continue to pretend that byte arrays are strings, and strings are byte arrays, but you're not going to get anywhere. The rest of the world decided that we want a useful abstraction over the underlying data structure. When we're working with strings, we care about cha

                  • by spitzak ( 4019 )

                    The text is 99.9999999% UTF-8.

                    What I want to do is gracefully handle tiny mistakes in the UTF-8 without having to rewrite every function and every library function it calls to take a "bytes" instead of a "string", and thus completely abandon useful Unicode handling!

                    Come on, it is blindingly obvious why this is needed, and I cannot figure out why people like you seem to think that physically possible arrangements of bytes will not appear in files. The fact that all serious software cannot use Unicode and has

                    • by godefroi ( 52421 )

                      There's your problem right there. There are no "tiny mistakes" in UTF-8. Either it's valid UTF-8, or it's not. It's valid XML, or it's not. It's valid JSON, or it's not. It's valid HL7, or it's not. There is no "graceful" handling of invalid data, not in the general case.

                      Physically possible arrangements of bytes will appear in files, yes, but those files are not necessarily UTF-8.

                      Oh, and all *my* serious software can handle Unicode just fine (in all its various encodings), because I use a platform that was

                  • by spitzak ( 4019 )

                    I'm arguing against a design that is the equivalent of saying "you can't run cp on this file because it contains invalid XML".

                    There is nothing wrong with the xml interpreter throwing an error AT THE MOMENT YOU TRY TO READ DATA FROM THE STRING.

                    There is a serious problem that just saying "this buffer is XML" causes an immediate crash if you put non-xml into it.

                • by Jmc23 ( 2353706 )
                  ... and by serious work you mean compromising systems right?
                  • by spitzak ( 4019 )

                    God damn you people are stupid.

                    I am trying to PREVENT denial of service bugs. If a program throws an unexpected exception on a byte sequence that it is doing nothing with except reading into a buffer, then it is a denial of service. If you really think that invalid UTF-8 can lead to an exploit you seem to completely misunderstand how things work. All decoders throw errors when they decode UTF-8, including for overlong sequences and all other such bugs. So any code looking at the unicode code points will stil

              • Re: and... (Score:5, Informative)

                by Anonymous Coward on Wednesday March 19, 2014 @01:25AM (#46522307)

                The inconsistencies are fully within Python 2. My experience is closer to full-scale horror when having to consider different encodings in Python 2, and since I am from a country that actually needs these "bells and whistles" regarding encoding for regular I/O on a regular basis, I have met these issues many times. Using chains of codecs to read and write files, having to intercept exceptions and .encode() .decode() in differing combinations to be able to avoid Python 2 "double-crashing" when reporting an exception, deep level hacking to reinitialize sys.stdout before output on certain machines, etc.

                In Python 3, it does not "just work", but that is because character encoding is never a "just works" problem, and languages that say it is fail miserably in this regard as soon as it meets real world international encodings. Python 3 defines the problem correctly, and solves it natively in the best way I can imagine, by always being aware of the problem. No more prepending the u qualifier to every single string that might or might not be output (or combined with any other string that might or might not be output). Python 3 solves it correctly, by acknowledging character encoding as something that is actually an issue, and it does not make the silly assumption that ASCII is the way of the world. This assumption has been silly for at least 40 years, but many products were developed in ASCII centric regions, or at least in regions where you seldom saw more than one encoding, and never fully addressed the problem.
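As a small illustration of being explicit about encodings at the I/O boundary in Python 3, the sketch below writes text in one encoding and shows what happens when it is read back with the right and the wrong one. The sample word and the choice of Latin-1 are assumptions made for the example.

```python
import os
import tempfile

text = "sm\u00f6rg\u00e5sbord"

# Create a scratch file; the path is temporary and removed afterwards.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # The encoding is stated explicitly when writing...
    with open(path, "w", encoding="latin-1") as handle:
        handle.write(text)

    # ...so the bytes on disk are known.
    with open(path, "rb") as handle:
        raw = handle.read()

    # Reading with the same encoding recovers the text.
    with open(path, encoding="latin-1") as handle:
        decoded = handle.read()
finally:
    os.remove(path)

# Decoding the same bytes with the wrong encoding fails loudly.
try:
    raw.decode("utf-8")
    wrong_encoding_ok = True
except UnicodeDecodeError:
    wrong_encoding_ok = False

print(raw, decoded == text, wrong_encoding_ok)
```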

                The Python 3 standard library does strings right, and should get credit for it. Instead it gets flak from programmers who do not like that it does not inherit the quirks from Python 2 that we have become accustomed to (and are still miles better than in many other languages; PHP and unicode, anyone?).

                Heck, the number one reason that I have converted as many projects as I can to Python 3 is because of the blocks of encoding centered Python 2 code I can just throw out the window, and ease future maintenance. There are still some big module holdouts, but that was a much larger problem in ~2010. Today, the ones I miss in Python 3 are e.g. WXPython (where work is ongoing in the Phoenix project) and MySQLdb (the MySQL connection alternatives for Python 3 are outright silly -- either non-functional or non-documented).

                There are several introductory programming courses I know of that focus on Python, and they all use Python 3 by default. I am sincerely looking forward to the day when Python 3 is the natural order.

                It takes a lot of motivation to change language structures from Python 2, and those working on the drafts are certainly top-class in their fields, so if one finds any design changes weird, the first instinct should be to read up on the rationales for the decisions. I have yet to encounter a change that seems "silly" or unnecessary after reading about the process.

                Also, for the early adopters, note that Python 3.3 (or 3.4, as this article is about) is not 3.0 or 3.1. There are a lot of things that have been fixed along the way.

                • In Python 3, it does not "just work", but that is because character encoding is never a "just works" problem, and languages that say it is fail miserably in this regard as soon as it meets real world international encodings.

                  What's wrong with simply presenting everything with Unicode? It might not be the most efficient possible way of representing text, but that's unlikely to matter much, given that you're using Python.

            • by jythie ( 914043 )
              This kinda highlights why a lot of people are still on Python 2.x. Python 3.x kinda comes across as a language fetish rather than something pragmatic, incompatible changes for the sake of sexiness. Elegance and consistency are great when you are waxing poetic, but are of less importance when you are interested in a language as just a tool. A lot of the library authors fall into that latter category: it is a tool to get a job done, and Python 3 does not really prioritize pragmatism.
        • by EmperorOfCanada ( 1332175 ) on Wednesday March 19, 2014 @12:47AM (#46522213)
          I have recently started bathing in the waters of Python. What I have realized is that there is a core group within Python who are rightfully proud of their 3.x accomplishment. But they are solidly ignoring the fact that only a tiny percentage of people are using it. The reasons are quite simple: people will need 8 modules for their system, and 1 barely works with 3.x and another says something like "mostly works". Well, most people aren't willing to depend upon "mostly".

          Now module after module is going 3.x, but the other problem is that for most people having two pythons on their machine is a pain in the ass. I know there are tools to make this less painful, but I can tell you an easy way to make it painless: don't have two versions.

          Then there is this call that you should begin new projects in 3.x; but the problem again is the two versions issue.

          What bothers me about all this is that I come from a C++ / PHP world. With C++ I have upgraded countless times over many years and had close to zero problems with my code. I don't even know which compiler XCode is even using right now. With PHP my various upgrades have broken exactly one module and I hear rumours that the next big version of PHP will break one module in my older code. But I don't care as I am replacing my PHP with Python.

          Where I am worried is that the core Python people will do something stupid like announce an end of support date for 2.7. The problem there is that it might be easier for some people to install a whole different language to sit alongside Python 2.7 and start playing with that instead of smashing their machine in the teeth and simultaneously installing 3.x.
        • The problem is that 2.7's unicode is a hack that doesn't play nice with legacy code or 2to3. If you have any legacy code or third party library that expects string to behave like a bytes object, 2to3 will turn them into incompatible unicode strings. If you import unicode_literals from __future__, you get a monkeypatched string class which will cause problems with the existing 2.x code that expects string to behave like a bytes object.

          The biggest barrier to the Python 3 transition has been the lack of support

    • by gweihir ( 88907 )

      There are very few reasons to stick with the old model. Sure, it takes a bit to get used to some of the changes, but it is not that hard. And most good libraries have already moved over or are compatible with both.

    • by gnupun ( 752725 )
      Python 3 // division operator breaks division polymorphism:

      Let's do integer division first.

      Python 2:
      >>> 20 / 2 # int divide int -> int
      10

      Python 3:
      >>> 20 // 2 # int divide int -> int
      10

      >>> 20 / 2 # int divide int -> float (wtf?)
      10.0

      Now let's do floating point division.

      Python 2:
      >>> 2.5 / 5.0 # float divide float -> float
      0.5

      Python 3:
      >>> 2.5 // 5.0 # float intdivide float -> rounded float
      0.0

      # You have to use "/" for floating-pt division for

      • I take issue with your "musts". The way I see it, if I want a real answer to x/y I use x/y. If I want some special meaning of / (like integer division or rounded answers or blah blah), I use //. That is good design.

      • If a, b and c can be int or float, how can a programmer implement "a = b divide c" without a lot of ugly type checking?

        By determining what type you want in the result and choosing the operator that produces that type. And you can get Python 3 division behavior in Python 2.6 or 2.7 using from __future__ import division.
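The two operators discussed here behave as follows on Python 3. This is a plain sketch of built-in behavior; on Python 2.2 through 2.7 the same results for / would require placing from __future__ import division at the top of the module.

```python
# True division always returns a float for int operands.
true_div = 20 / 2

# Floor division returns an int for int operands.
floor_div = 20 // 2

# With float operands, floor division still floors but returns a float.
float_floor = 2.5 // 5.0

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -7 // 2

print(true_div, floor_div, float_floor, negative_floor)
```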

        • by gnupun ( 752725 )

          By determining what type you want in the result and choosing the operator that produces that type.

          But that's the point, sometimes the programmer implementing "a = b div c" does not know whether the result "a" is int or float because he's writing a general library function where depending on the application "a" may be int or it may be float. In python 2, he does not have to know and write "a = b / c". But that won't work in Python 3 without a lot of ugly type checking.

          From reading the PEP for the "// oper

          • sometimes the programmer implementing "a = b div c" does not know whether the result "a" is int or float because he's writing a general library function

            If the output of the general library function shall be a float, then use /. If the output of the general library function shall be an integer, then use //. If the output of the general function shall depend on the types of the arguments, then I'm having trouble understanding in what sort of case this would prove useful.

            where depending on the application "a" may be int or it be float.

            On what aspect of the application would it depend? As far as I can tell, whether a should be int or float depends on what kind of quantity b is intended to represent and what kind of quantity

            • by gnupun ( 752725 )

              In what specific "general library function" would "use floor division if neither b nor c is a float; otherwise, use true division" be helpful? It would prove easier for me to see your point if you can give a concrete example.

              Let's take this example: Write a function that computes the average of a list (or sequence, generally speaking) of numbers. The list may contain elements that are all either integers, floats or complex numbers. The result should have the same type as an individual element of the list

              • def average(seq): return sum(seq) / len(seq)

                Or in Python 3.4 or later: from statistics import mean as average

                Can you implement a simple average() function in Python 3 that satisfies the specifications mentioned above?

                I'm not clear what average([-10, -11]) should return under your specification. Or would it raise a ValueError? In general, the average of integers is not an integer but a rational, because integers are not closed [wikipedia.org] under mean. Mean would have to promote to a type capable of representing (or at least approximating) rationals. Python has no built-in exact rational type, but it does have float.
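For comparison, here is how the one-line average, statistics.mean, and an exact rational behave on the inputs under discussion in Python 3.4 or later. The exact-rational line uses fractions.Fraction, which lives in the standard library rather than among the builtins; it is included only as one possible way to keep the mean exact, not as something proposed in the thread.

```python
from fractions import Fraction
from statistics import mean


def average(seq):
    """One-line mean using true division, as written in the comment above."""
    return sum(seq) / len(seq)


values = [-10, -11]

plain = average(values)          # true division yields a float
library = mean(values)           # statistics.mean agrees on the value
exact = Fraction(sum(values), len(values))  # exact rational, -21/2

print(plain, library, exact)
```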

                • by gnupun ( 752725 )

                  Or in Python 3.4 or later: from statistics import mean as average

                  I don't have 3.4, but it would be better if you could implement average using just the sum() function just so you can experience the split personality of the division operators in python 3.

                  I'm not clear what average([-10, -11]) should return under your specification.

                  As mentioned in the spec, int list should return an int. There's no confusion and no rationals because most computer programs either deal with integers or floating point numbers

                  • As mentioned in the spec, int list should return an int.

                    Please clarify the spec further: If the list contains an int and a float (such as average([10, 12.0])), should the result be returning a float or raising a ValueError? I admit that my process of requirements elicitation [wikipedia.org] sounds like I'm asking a lot of nitpicky questions, but it's necessary to avoid an underspecified function. "What Python used to do prior to the addition of true division in 2.2" is still underspecified if the function is ported to any language other than Python.

                    So your function should return func([-10,-11]) --> (-10-11)/2 = -11 under the rules of integer division.

                    Until now you have left "the

                    • by gnupun ( 752725 )

                      Please clarify the spec further: If the list contains an int and a float (such as average([10, 12.0])), should the result be returning a float or raising a ValueError?

                      No need to care about mixed types. But if you want to be that specific:
                      a) if all elements are ints, result should be an int
                      b) if any element is a float, result should be a float
                      c) if any element is complex, result should be complex.
                      Rule (c) has higher priority over (b), which has higher priority over (a).

                      Until now you have left "the rules of i

                    • If your spec is "do what Java does", then Java integer division rounds toward zero [stackoverflow.com], and Java integer addition wraps within (signed) 32 bits [stackoverflow.com]. This means the naive Python 2 implementation gets it wrong too, and we're back to square one where we're scanning the list for being all ints and using the "act like Java" method if no float or complex types are found in the array.
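One possible reading of rules (a) through (c) above can be sketched in Python 3 with a single type scan. This is an assumption-laden sketch rather than anyone's posted solution: it uses Python's floor division for the all-int case, so it does not reproduce Java's round-toward-zero or 32-bit wraparound behavior.

```python
def average(seq):
    """Mean of seq: int result for all-int input, otherwise float or complex."""
    total = sum(seq)
    count = len(seq)
    if all(isinstance(item, int) for item in seq):
        # Rule (a): all ints, so use floor division and stay in int.
        return total // count
    # Rules (b) and (c): sum() has already promoted to float or complex.
    return total / count


ints = average([-10, -11])
mixed = average([10, 12.0])
complexes = average([1 + 2j, 3 + 4j])

print(ints, mixed, complexes)
```

Under Python's floor rule the all-int case gives -11 for [-10, -11]; a round-toward-zero rule, as in Java, would give -10 instead.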
      • In the real world, 5 / 2 == 2.5. This is true whether the operands are integers or floats.

        In some discrete math systems, 5 / 2 == 2.

        A language has no way to really know what kind of problem you are working on and which calculation would be more appropriate. Python 2 made the assumption that if you fed in two integers, then you were working in a discrete math system. This turns out to not usually be the case, and was a source of surprise and bugs for many people. (Python 2's division was modeled after C, whi

      • You've decided for yourself that integer division must return an integer result even though it isn't always mathematically correct. You don't have to use truncating division unless you need it. On the other hand, unexpected truncation can cause a wide assortment of problems. If you want guaranteed integer division use //. It's been there since 2.2.

        The Python 3 division system is more consistent because you always get the correct result and have the choice to throw away precision when it is unwanted. The tru

  • Ob (Score:5, Funny)

    by Hognoxious ( 631665 ) on Wednesday March 19, 2014 @05:10AM (#46522861) Homepage Journal

    Have they fixed the whitespace bug yet?
