Java Oracle

Oracle Calls Java Serialization 'A Horrible Mistake', Plans to Dump It (infoworld.com) 198

An anonymous reader quotes InfoWorld: Oracle plans to drop from Java its serialization feature that has been a thorn in the side when it comes to security. Also known as Java object serialization, the feature is used for encoding objects into streams of bytes... Removing serialization is a long-term goal and is part of Project Amber, which is focused on productivity-oriented Java language features, says Mark Reinhold, chief architect of the Java platform group at Oracle.

To replace the current serialization technology, a small serialization framework would be placed in the platform once records, the Java version of data classes, are supported. The framework could support a graph of records, and developers could plug in a serialization engine of their choice, supporting formats such as JSON or XML, enabling serialization of records in a safe way. But Reinhold cannot yet say which release of Java will have the records capability. Serialization was a "horrible mistake" made in 1997, Reinhold says. He estimates that at least a third -- maybe even half -- of Java vulnerabilities have involved serialization. Serialization overall is brittle but holds the appeal of being easy to use in simple use cases, Reinhold says.

This discussion has been archived. No new comments can be posted.

  • by gweihir ( 88907 ) on Saturday May 26, 2018 @02:40PM (#56679396)

    But the Java fanatics just put in more and more features, regardless of whether sane languages had them or not.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      But the Java fanatics just put in more and more features, regardless of whether sane languages had them or not.

      Obvious?

      Well, given the abstraction from actual hardware that is Java's goal, how would you create a way to pass data from machine to machine without worrying about things like word size and endianness?

      Got any objective reasons? Because what you've posted is just an opinion. And just like that other thing everyone else has, frankly it stinks.
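A minimal sketch of the word-size and endianness point, assuming nothing beyond the standard library: `DataOutputStream.writeInt` always emits four bytes in big-endian order, whatever the host CPU does, so sender and receiver only need to agree on the wire format. The class name `EndianDemo` and its helper methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class EndianDemo {
    /** Encodes an int the way DataOutputStream does: 4 bytes, big-endian. */
    static byte[] encodeBigEndian(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Decodes 4 bytes with an explicit byte order, independent of the host. */
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = encodeBigEndian(0x01020304);
        System.out.println(Arrays.toString(wire));
        // Reading with the agreed order recovers the value; the wrong order does not.
        System.out.printf("%08x%n", decode(wire, ByteOrder.BIG_ENDIAN));
        System.out.printf("%08x%n", decode(wire, ByteOrder.LITTLE_ENDIAN));
    }
}
```

The same fixed-format idea applies to any hand-rolled protocol; it does not by itself require object serialization.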

    • Oh please. This wasn't a failure with their implementation. It's an issue with the concept which is still a good thing because the positives still outweigh the negatives.

      It just sucks, though, going through 100+ projects to add jaxb to their pom files to prepare for Java 11 LTS that's coming in September.

      • by gweihir ( 88907 )

        The good is minuscule, the bad is massive. And that was obvious back then. We did joke "Now Java even supports malicious mobile code!" when the feature was announced and wondered how this could ever be secured. Of course, many people thought it was great because they did not get it. Just as many people today.

    • This has nothing to do with "Java fanatics".

      Java has serialization because it has RMI. Java has RMI because it was designed by Sun in the early 90s when everyone except Sun had already realised that sunrpc was a bad idea. The network is the computer, right?

  • What is this thing called a "record"? What possible use is it? Or could it be a new name for an old concept that might have existed in some ancient cruddy languages. Cobol anyone?
    • by hazem ( 472289 )

      Cobol anyone?

      I thought I was going to old-school school people by mentioning QBasic's "type" structures, but you punked me with Cobol.

      But then again, not even Python does this well if you need a structure with specific data types to match a binary stream you need to read or write.

    • A record is basically what in C you would call a struct. The reason why Java desperately needs them is that there is currently no way to efficiently store an "array of structs". Yes, you could do SoA, but that isn't what you want some of the time.

      The inability to control memory layout more finely is the main thing that people trying to write high-performance Java complain about. This will help, at least a bit.

  • by Gravis Zero ( 934156 ) on Saturday May 26, 2018 @03:28PM (#56679544)

    Regardless of language, object serialization is a dangerous idea. While it may seem like a nice idea at first, loading objects from unverified mutable data is an invitation for someone to tinker with that data. The situation only gets worse when your object structure changes because now your object data is invalid or incomplete.

    Much like goto, I'm not arguing that it's not useful but rather that its use is inherently dangerous.

    • > Much like goto

      Or worse, eval().

    • by goose-incarnated ( 1145029 ) on Saturday May 26, 2018 @03:38PM (#56679576) Journal

      Regardless of language, object serialization is a dangerous idea. While it may seem like a nice idea at first, loading objects from unverified mutable data is an invitation for someone to tinker with that data.

      Okay then, smartypants, what do you propose for persisting fields of an object? Anything you propose is, by definition, "serialisation". The only alternative to serialisation is non-persistent objects.

      (TBH, I kinda like the thought of signed serialised blobs)
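A minimal sketch of the "signed serialised blobs" idea, assuming both ends share a secret key and using HMAC-SHA256 from `javax.crypto` (strictly a keyed MAC rather than a public-key signature). The class `SignedBlob` and the methods `seal` and `open` are hypothetical names. The point is that the tag is verified before any deserializer sees the payload.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignedBlob {
    private static final int TAG_LEN = 32; // HMAC-SHA256 output size in bytes

    static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns tag followed by payload. */
    static byte[] seal(byte[] key, byte[] payload) {
        byte[] tag = hmac(key, payload);
        byte[] out = Arrays.copyOf(tag, TAG_LEN + payload.length);
        System.arraycopy(payload, 0, out, TAG_LEN, payload.length);
        return out;
    }

    /** Returns the payload only if the tag verifies; throws otherwise. */
    static byte[] open(byte[] key, byte[] blob) {
        if (blob.length < TAG_LEN) {
            throw new SecurityException("blob too short");
        }
        byte[] tag = Arrays.copyOfRange(blob, 0, TAG_LEN);
        byte[] payload = Arrays.copyOfRange(blob, TAG_LEN, blob.length);
        // MessageDigest.isEqual compares without an early exit on mismatch.
        if (!MessageDigest.isEqual(tag, hmac(key, payload))) {
            throw new SecurityException("bad tag: payload was modified or key differs");
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] key = new byte[32]; // placeholder key; a real one must be random and secret
        byte[] blob = seal(key, "state".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(open(key, blob), StandardCharsets.UTF_8));
    }
}
```

This only protects against tampering by parties who do not hold the key; it says nothing about whether the payload format itself is safe to parse.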

      • Put it in the cloud, of course!

      • Re: (Score:3, Insightful)

        by Gravis Zero ( 934156 )

        Regardless of language, object serialization is a dangerous idea.

        Okay then, smartypants, what do you propose for persisting fields of an object?

        I was speaking specifically about object serialization. There's nothing wrong with data serialization but using it for object serialization is asking for trouble. If you don't understand the difference then you should excuse yourself.

        • Gravis Zero: Oh I guess all those guys making object databases are damned fools and you know better than them. We're not worthy, we're not worthy.

      • ok, help me out here. If you save an object (or any data) to a file, as long as you validate the data when you open the file and load it..... what's the problem?

        Because we have "persistent objects", they're called files.

        signed serialized blobs

        We call those CRC bits, or checksums. Usually there's one per record and/or one for the whole file / stream / whatnot.

        Also, those round things are called "wheels", there's really no need to re-invent them. And PLEASE don't try patenting them, that has the potential for a big headache fo

        • The issue is that you're also persisting the methods, not just the fields.
          • by jb_nizet ( 98713 )

            No, you don't. But, you allow a hacker to modify the persisted bytes and thus make the production code load objects that have a state that they should never, ever have, breaking their invariants, and possibly make them call constructors of classes that they should never call.
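A minimal sketch of the broken-invariant case described above, assuming a hypothetical class `Percent` whose constructor only accepts 0 to 100. Default deserialization never runs that constructor, so changing one byte of the stream yields an instance the normal API could not create. The sketch relies on the field value being the last bytes of the stream, which holds for this single-field class without a custom `writeObject`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class TamperDemo {
    /** Hypothetical value class: the constructor enforces 0..100. */
    static final class Percent implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int value;

        Percent(int value) {
            if (value < 0 || value > 100) {
                throw new IllegalArgumentException("out of range: " + value);
            }
            this.value = value;
        }

        int value() { return value; }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Overwrites the low byte of the stored int and reads the object back. */
    static int tamperedValue() {
        byte[] bytes = serialize(new Percent(50));
        bytes[bytes.length - 1] = (byte) 200;
        return ((Percent) deserialize(bytes)).value();
    }

    public static void main(String[] args) {
        // The constructor check never ran for the deserialized instance.
        System.out.println(tamperedValue());
    }
}
```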

    • by HiThere ( 15173 )

      Serialization is quite important, however. My preference is that it contain some "signing bytes" to identify what it is, including version number, and a checksum. This still doesn't protect against hostile action, of course, but is more for detecting that you can handle the version and you know what it is you're deserializing. It might also identify the word-length and byte order, to make it more portable, but in my typical use case having it match my native processor is more important than portability.
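A minimal sketch of the "signing bytes, version number and checksum" layout described above. The magic number, the field order and the class name `FramedRecord` are assumptions made for illustration, not an established format. As the comment says, a CRC detects corruption and version mismatches, not hostile edits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class FramedRecord {
    static final int MAGIC = 0x52454331; // ASCII "REC1", hypothetical identifier
    static final short VERSION = 1;

    /** Layout: magic (4) | version (2) | length (4) | payload | crc32 (8). */
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(MAGIC);
            out.writeShort(VERSION);
            out.writeInt(payload.length);
            out.write(payload);
            out.writeLong(crc.getValue());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Checks magic, version, length and checksum before returning the payload. */
    static byte[] unframe(byte[] framed) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed))) {
            if (in.readInt() != MAGIC) {
                throw new IllegalArgumentException("not a REC1 frame");
            }
            short version = in.readShort();
            if (version != VERSION) {
                throw new IllegalArgumentException("unsupported version " + version);
            }
            int len = in.readInt();
            if (len < 0 || len > framed.length) {
                throw new IllegalArgumentException("bad length " + len);
            }
            byte[] payload = new byte[len];
            in.readFully(payload);
            CRC32 crc = new CRC32();
            crc.update(payload, 0, len);
            if (in.readLong() != crc.getValue()) {
                throw new IllegalArgumentException("checksum mismatch");
            }
            return payload;
        } catch (IOException e) {
            throw new IllegalArgumentException("truncated frame", e);
        }
    }

    public static void main(String[] args) {
        byte[] framed = frame("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(unframe(framed), StandardCharsets.UTF_8));
    }
}
```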

  • by angel'o'sphere ( 80593 ) <angelo.schneider ... e ['oom' in gap]> on Saturday May 26, 2018 @04:06PM (#56679700) Journal

    Why would serialization be a security risk?

    Huh? It can't ... you write to a disk or to a socket and that's it.

    Sure, I'm nitpicking, because deserialization might be a security risk.

    However, only if you actually do it and e.g. leave open paths by which bad files can end up on your disk, which you then read, or open a socket and accept incoming serialized objects.

    A typical Java program is absolutely not vulnerable to anything regarding serialization unless the programmer (intentionally?) made it so.

    Articles about this (and basically every post here in the story while I type this) are simply wrong.

    Java Serialization was once its strongest point of success. Many GUI builders let you edit "beans" and simply serialize the GUI as a graph of objects that simply gets read in again when the application starts and you call the setVisible(true) method to show your window.

    Not needing to write any boilerplate code for writing and reading objects is a huge time saver and simplification.

    • Serialization of data only is not a risk. Serialization of objects that can contain references to other objects is the problem. Because an attacker can tamper with the raw binary object that can still be deserialized, but now has different contents and now will run differently on the other end, in a manner not expected or possibly controlled.

      Basically, serializing anything that can be acted upon is dangerous. It is like sending you a package and then immediately telling you to pick the first thing out of th

      • You could make the same argument about using Hibernate and constructing objects from a relational database. You could use the same argument about basically any program that does IO. It's a damned stupid argument. Whether and how much validation you need to do on your IO is application specific. If your application needs to validate the entire state of the object graph before doing anything with it, then you need to do that. It's not a problem with the concept of deserialisation.

      • Because an attacker can tamper with the raw binary object that can still be deserialized, but now has different contents and now will run differently on the other end, in a manner not expected or possibly controlled.
        Yeah, and he can use an SQL statement to change a row in the database ... or a PERL script to change a line in a text file ... what exactly is the difference?
        And it has nothing to do with graphs anyway. It can be a single object, consisting only of primitive types.

        Hint: the problem is code,

  • OK I don't get it. Serialization is just saving the field values to a file and then reading it back. Of course if you just read a file without any validation and you don't know if it has been tampered with then of course you can have security issues. But this applies to any file formats or data anywhere. Java serialization is not a unique case. Serialization is an easy way to load values without going through an intermediary format. Replacing it with JSON or XML doesn't change the issue one bit.

    • Re:I don't get it (Score:5, Informative)

      by angel'o'sphere ( 80593 ) <angelo.schneider ... e ['oom' in gap]> on Saturday May 26, 2018 @08:53PM (#56680838) Journal

      Java is unique insofar as, when you use built-in serialization, you also serialize the class files.
      There are two "marker interfaces" to make a Java class serializable: Serializable and Externalizable.

      In case of the first one, the Java Framework/VM uses reflection to serialize and deserialize objects.
      In case of the second one, you are required to implement the methods writeExternal() and readExternal().

      As the class files are in the serialized data stream, a program reading "untrusted" serialized data might also load classes, i.e. code, from that stream. If that code implements Externalizable and thus has an "unknown foreign" method readExternal(), the deserialization framework will call that unknown/untrusted method readExternal(), which means you run code coming from outside, which can do whatever it wants besides reading the object from the object stream.
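A minimal sketch contrasting the two interfaces named above, using hypothetical classes `PointS` and `PointE`. With `Serializable` the runtime reads and writes the fields reflectively; with `Externalizable` the class supplies `writeExternal()` and `readExternal()` and must offer a public no-arg constructor, which the runtime calls before `readExternal()`. In this sketch both classes are already on the classpath, and the stream carries class names and field data only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ExternalizableDemo {
    /** Reflection-driven: no serialization code in the class itself. */
    static class PointS implements Serializable {
        private static final long serialVersionUID = 1L;
        int x, y;
        PointS(int x, int y) { this.x = x; this.y = y; }
    }

    /** Hand-written wire format; the public no-arg constructor is required. */
    public static class PointE implements Externalizable {
        private static final long serialVersionUID = 1L;
        int x, y;
        public PointE() { }
        PointE(int x, int y) { this.x = x; this.y = y; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Round-trips one object of each kind and reports the recovered fields. */
    static String demo() {
        PointS s = roundTrip(new PointS(3, 4));
        PointE e = roundTrip(new PointE(5, 6));
        return s.x + "," + s.y + " " + e.x + "," + e.y;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```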

      • by kbg ( 241421 )

        If you are reading the code also from an untrusted stream then yes, of course you will have security issues. But that is a completely separate matter. You don't ever load code from an untrusted source.

        I don't see anywhere in the Java specifications that code is also read in when deserializing. Can you point me to that spec?

        • You are right, normal serialization to files does not include the code; only via RMI is the code included (or requested by the recipient) when the recipient does not have the classes on the classpath.

    • Not just "field values" but executable code!

  • by Wookie Monster ( 605020 ) on Saturday May 26, 2018 @04:24PM (#56679778)
    I'm concerned that someone might hear "object serialization is bad, but JSON is good" and make the same mistakes that were made with Java object serialization. Java object serialization is bad for the following reasons:

    1. No validation. You might have a nicely designed object, well tested, with all sorts of validation checks to ensure that the internal state is never broken. Java object serialization bypasses all validation, permitting an attacker to construct a malformed object. Exactly how that would cause a problem requires a bit more work on the attacker's part, by studying how the application reacts to the malformed object. Adding validation is supported with Java serialization, but it's not used by default. The designers favored simplicity over safety. Does switching to JSON magically fix the validation problem? Nope.

    2. Loading of classes that you didn't expect to load. If I expect to receive a serialized list of strings, there's nothing to prevent an attacker from providing a list of any kind of object instead, due to type erasure. The application might fail to process the list because of a ClassCastException, but the potential damage is done. Java serialization /does/ support filtering out classes that aren't expected, but this is off by default. You need to define the blacklist yourself. Why is loading other classes a problem? See the next reasons:

    3. Custom code during deserialization, which is actually necessary for performing your own validation. You can define your own code which runs when the object is deserialized, and the code can do pretty much anything. An attacker might be able to trick the code (using malformed input) into doing something harmful.

    4. Additional classes on the classpath. Even if all of your code is well behaved, and has proper validation checks, and proper custom code, you're still vulnerable because additional classes exist that you're not aware of. You had no idea that there's this class 'Q' which has broken custom code, because Q was sucked in as a dependency of something else. That popular open source library you're using might be exposing your application to attack, and you didn't even know it.

    For anyone designing an object serialization mechanism, always consider the tradeoffs when trying to make the system easier to use. Always use whitelists for trusted code instead of blacklists. Always construct objects using the object's public API. Favor the use of standard representations (maps, lists, tuples) instead of supporting full-blown customization. A little bit of friction can be a good thing.
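A minimal sketch of the allow-list advice above, using `java.io.ObjectInputFilter` (available since Java 9) with a hand-written filter rather than a pattern string. The class `Gadget` stands in for an unexpected class that happens to be on the classpath; it, the `EXPECTED` set and the array-length cap of 10,000 are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class AllowListDemo {
    /** Hypothetical class the application never expects to receive. */
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Only these classes may appear in the stream.
    private static final Set<Class<?>> EXPECTED = Set.of(ArrayList.class, String.class);

    static ObjectInputFilter.Status check(ObjectInputFilter.FilterInfo info) {
        Class<?> c = info.serialClass();
        if (c == null) {
            return ObjectInputFilter.Status.UNDECIDED; // callback about limits, no class
        }
        if (c.isArray()) {
            // Backing arrays (for example inside ArrayList): only cap the size.
            return info.arrayLength() > 10_000
                ? ObjectInputFilter.Status.REJECTED
                : ObjectInputFilter.Status.UNDECIDED;
        }
        return EXPECTED.contains(c)
            ? ObjectInputFilter.Status.ALLOWED
            : ObjectInputFilter.Status.REJECTED;
    }

    /** Serializes the payload, then reports whether the filtered reader accepts it. */
    static boolean accepts(Object payload) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(payload);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                in.setObjectInputFilter(AllowListDemo::check);
                in.readObject();
                return true;
            }
        } catch (InvalidClassException e) {
            return false; // the filter rejected a class in the stream
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Object> good = new ArrayList<>(List.of("a", "b"));
        List<Object> bad = new ArrayList<>(List.of("a", new Gadget()));
        System.out.println(accepts(good));
        System.out.println(accepts(bad));
    }
}
```

Because the rejection happens when the class descriptor is read, the unexpected object is never instantiated, which addresses reason 2 without relying on a later ClassCastException.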

    • Re: (Score:2, Insightful)

      by Anonymous Coward

      To answer your points with the obvious:

      1) Use the validation supported by java, just like you would in XML, JSON, etc. Problem solved. Serialization isn't the issue here, the app dev is. The app dev can be lazy on XML or any other serialization classes.

      2) Same as point 1. The facility is there, use it. 'off by default' isn't an excuse for it being 'bad'

      3) Unit tests and write proper code. Again, this problem isn't different to any other XML/JSON serialisation mechanism.

      4) You have the same issue with any XM

      • by Dog-Cow ( 21281 )

        Unit tests will never catch the cases you didn't think of. Attackers are going to exploit the cases you didn't think of. Ergo, unit tests cannot ensure security.

    • by Jeremi ( 14640 ) on Saturday May 26, 2018 @05:25PM (#56679982) Homepage

      If I'm following you correctly, the problem isn't serialization per se but rather the fact that the deserialization is being done by the Java runtime (which has no way to validate the resulting objects against the application's requirements, since its deserialization code is application-independent, and also has the power to instantiate any kind of object, even those that are totally irrelevant to the task at hand), rather than by the application itself.

      A user-supplied deserialization-routine, OTOH, has at least a chance of being secure in the face of invalid source data, since it can check to make sure that its constraints are correctly satisfied and reject the data if they aren't.
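A minimal sketch of a class that checks its own constraints during deserialization, assuming a hypothetical `Age` class limited to 0 to 150. The private `readObject(ObjectInputStream)` hook calls `defaultReadObject()` and then rejects out-of-range state with `InvalidObjectException`. The tampering step assumes the single int field is the last four bytes of the stream, which holds here because the class has one field and no custom `writeObject`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ValidatingReadObject {
    /** Hypothetical value class with the invariant 0 <= years <= 150. */
    static final class Age implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int years;

        Age(int years) {
            if (years < 0 || years > 150) {
                throw new IllegalArgumentException("years out of range: " + years);
            }
            this.years = years;
        }

        // Called by the serialization runtime instead of the constructor.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (years < 0 || years > 150) {
                throw new InvalidObjectException("years out of range: " + years);
            }
        }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Returns true if reading the bytes fails the invariant check. */
    static boolean isRejected(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return false;
        } catch (InvalidObjectException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes a valid Age, then overwrites the low byte of the field with 200. */
    static byte[] tamperedStream() {
        byte[] bytes = serialize(new Age(42));
        bytes[bytes.length - 1] = (byte) 200;
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(isRejected(serialize(new Age(42))));
        System.out.println(isRejected(tamperedStream()));
    }
}
```

The check duplicates the constructor's rule, which is one reason this style is easy to forget or let drift out of sync.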

      Of course, avoiding making every application developer write his own application-specific serialization/deserialization routines was largely the point of this Java feature, but in hindsight it appears that was a bad decision.

      • Of course, avoiding making every application developer write his own application-specific serialization/deserialization routines was largely the point of this Java feature, but in hindsight it appears that was a bad decision.

        And this decision is just further evidence of Oracle's incompetence. Instead of keeping it but requiring every application developer to write his own object verifier, they're simply removing it because doing it right is hard.

      • by Anonymous Coward on Saturday May 26, 2018 @06:05PM (#56680164)

        If I'm following you correctly, the problem isn't serialization per se but rather the fact that the deserialization is being done by the Java runtime (which has no way to validate the resulting objects against the application's requirements, since its deserialization code is application-independent, and also has the power to instantiate any kind of object, even those that are totally irrelevant to the task at hand), rather than by the application itself.

        Java deserialization is magic. By which I mean it behaves in several ways that user code pretty much can't.

        The default system effectively loads a binary blob off the input stream and then creates each object without calling a constructor*. You can't just not call a constructor in Java, but Java deserialization does. All the fields are set by magic, by which I mean it ignores getters and setters and whatever access level might be on the fields. Any field marked as "not serialized" (transient) is left with default values - but those may not be the default values you think! If you write private transient int foo = 3; then foo won't be serialized, and when the object is deserialized, it will instead be ... 0. Because 0 is the default for ints.

        How does Java deserialization know if it's loading the right fields for a given object? Well, it's magic, but not that magic - you're supposed to let it know by setting the serialization ID for the class. And how do you do that? By declaring a static long serialVersionUID, and making sure you update it whenever your class structure changes. Don't do that and the deserialization logic might not notice that the structure doesn't quite match. No, you can't just have it autogenerate one - if not set, the serialization/deserialization code will create one, but it may be dependent on compiler and randomly break across identical code bases. Surprise!
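A minimal sketch of the serialVersionUID mechanism, using the standard `ObjectStreamClass.lookup` call to read back the ID the runtime associates with a class. The classes `Pinned` and `Unpinned` are hypothetical; the second one omits the field, so the runtime computes a hash from the class structure instead.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialUidDemo {
    /** Declares its ID explicitly, so it stays stable until someone edits it. */
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        int a;
    }

    /** No declared ID: the runtime derives one from the class's shape. */
    static class Unpinned implements Serializable {
        int a;
    }

    /** Returns the ID written into, and checked against, serialized streams. */
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Pinned.class));
        // Computed value; it can change when the class structure changes.
        System.out.println(uidOf(Unpinned.class));
    }
}
```

On reading, a stream whose ID differs from the local class's ID is refused with an InvalidClassException.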

        But in any case, the serialization system is magic. How do you write a custom serializer/deserializer? By creating the private methods writeObject(ObjectOutputStream) and readObject(ObjectInputStream). Because the serializer is magic, it can access these private methods. (Note that readObject(ObjectInputStream) gets called on a magically created object that has never had a constructor called on it, so all fields will have their default values! How does that work with final fields? Well... the short answer is "like shit." The longer answer is that the default deserializer just ignores the final modifier (which you can't do in generic code), and that if you want to do the same, there's some reflection magic or non-standard APIs you can do.)

        So anyway, there's a basic overview of how Java serialization defies expectations and basically guarantees that anyone writing code that involves serialization will do it wrong.

        * This is false. What it really does is go up the object hierarchy and look for the first parent class that does not declare itself serializable and calls its default no-args constructor. But that means that your class that you declared serializable therefore, by definition, does not get its constructor called. Surprise!
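A minimal sketch of the behaviour described in this comment and its footnote, using hypothetical classes `Base` and `Widget`. After a round trip the transient field holds 0 rather than its initializer value 3, the serializable class's own constructor has not run again, and the no-arg constructor of the first non-serializable superclass has.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class MagicDemo {
    /** Not Serializable: its no-arg constructor runs during deserialization. */
    static class Base {
        static int baseCtorCalls = 0;
        Base() { baseCtorCalls++; }
    }

    /** Serializable: its constructor and field initializers are skipped. */
    static class Widget extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        static int ctorCalls = 0;
        transient int foo = 3; // initializer is part of the constructor
        int bar = 7;
        Widget() { ctorCalls++; }
    }

    static Widget roundTrip(Widget w) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(w);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (Widget) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates one Widget normally, copies it via serialization, and reports. */
    static String report() {
        Widget.ctorCalls = 0;
        Base.baseCtorCalls = 0;
        Widget copy = roundTrip(new Widget());
        return "foo=" + copy.foo + " bar=" + copy.bar
            + " widgetCtor=" + Widget.ctorCalls + " baseCtor=" + Base.baseCtorCalls;
    }

    public static void main(String[] args) {
        // prints "foo=0 bar=7 widgetCtor=1 baseCtor=2"
        System.out.println(report());
    }
}
```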

        • But that means that your class that you declared serializable therefore, by definition, does not get its constructor called. Surprise!

          No surprise. Calling a constructor when you deserialize an object makes no sense. That is why Java rightfully does not do that.

      • Of course, avoiding making every application developer write his own application-specific serialization/deserialization routines was largely the point of this Java feature, but in hindsight it appears that was a bad decision.

        Yeah. It seems like there's no really good way to make this feature work. Whitelists can help, but ultimately there is no way to avoid thinking about security when you read things off the wire.

      • by Dog-Cow ( 21281 )

        What I've gotten is that Java allows the actual byte-code of the class implementation to be included in the serialized data. This allows code to be injected, and not just malformed data. I would think the fix would be to just remove the part of the JVM which creates new classes from serialized data.

  • by pestilence669 ( 823950 ) on Saturday May 26, 2018 @06:56PM (#56680366)
    Serialization isn't inherently bad. It's bad practices and misuse, which won't change. It'll just be replaced by many developers with XML, JSON, Protobuf, YAML, or other. Then, someone will inevitably sprinkle on some reflection or code generation, and you've almost done a 360... but with a lot more code and even more that could go wrong. I don't agree that adding more training wheels and/or removing features is always the best way to fix bad developer habits.
  • In some way, Java started as a toy language and has headed downhill since. Multiple inheritance and deterministic object destruction are hard but useful. Java never had those, but it had an option to have full featured, grown up applications on desktop and in the web browser. Of course native look and feel of the former and security of the latter is hard. So - out these features go. Couldn't make J2ME on mobile phones work either, so it took another company to productize Java for mobile apps. Instead of appreciatin

  • by account_deleted ( 4530225 ) on Sunday May 27, 2018 @03:14AM (#56681814)
    Comment removed based on user account deletion
  • Object Serialization was horrible from day one. It was the tool of lazy programmers for years.

    Performance wise it was a disaster. People would pass objects between jvm instances with no regard to the size of the data blob they were sending. When in reality very little of an object is required to be sent in most use cases. Example: When you have a java cluster and you sync session objects between instances. ( Note this is just dumb anyway. There are far better patterns for this. ) But your low cost dev
