
 



Java Programming

Java EE & Streaming Architectures

Amin Ahmad writes "Implementing a streaming architecture on a Java EE application server provides asymptotically better memory performance, and, hence, scalability, than current, widely-implemented, Java EE patterns endorsed by Sun. This article provides a concrete implementation of a streaming architecture and compares its scalability to two other, standard implementations: Remote EJB and Local EJB-based solutions. The implementation based on a streaming architecture comes out the hands-down winner: for example, when sending back 300 rows of data to the client, the Local EJB solution fails beyond 16 concurrent users whereas the streaming solution is still running at 128 concurrent users! The article includes complete source code and the entire results database for the stress test. I would be interested in hearing your feedback."
This discussion has been archived. No new comments can be posted.

  • Interesting, but... (Score:3, Interesting)

    by VGPowerlord ( 621254 ) on Monday December 11, 2006 @07:42AM (#17193022)
    I see the following problems with the article.

    1. Other than MySQL, it doesn't specify the software in use (it implies Apache Tomcat, but that is not explicitly stated), except...
    2. Microsoft Web Application Stress Tool. Pardon me if I refuse to put any faith into tools by Redmond. Particularly since, if Tomcat was in use, MWAST is being used instead of Apache's own ab [apache.org] tool.
    3. Why wasn't Java 1.5 tested? By definition, Java 1.4 means that you're testing vs. EJB 2.x instead of EJB 3.x. I don't know what changes have been made between the two, as I haven't learned EJB, but I'm assuming there have been some changes between the two, for better or for worse.
    4. What's causing the OutOfMemory errors? If a pair of servers are falling over at 16 simultaneous requests for a 301 row dataset, there's a major problem.

    Just some thoughts.
    • I don't agree with your comment about MWAST. Just because Microsoft made it doesn't mean it doesn't work. However, I was also wondering about their OOM errors. A president is a 7KB (on average) picture plus a start date, end date, name, and a summary. Say the rest adds up to an extra 3KB. So that's 10KB per row, you're returning 300 rows per client, and the memory usage is supposedly linear in the number of rows.

      So that's about 47 megs as a lower bound for 16 users (16*10*300). A strict upper bound would be about 750 m
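The back-of-the-envelope estimate above can be written out as a small sketch. The 10KB row size is the commenter's own guess (a 7KB picture plus roughly 3KB of other fields), and the class and method names here are hypothetical, made up only for illustration.

```java
// Hypothetical helper: reproduces the commenter's rough estimate, nothing more.
final class MemoryEstimate {
    /** Lower bound in KB when every concurrent user holds a full copy of the result set. */
    static long lowerBoundKb(int users, int rowsPerUser, int kbPerRow) {
        return (long) users * rowsPerUser * kbPerRow;
    }

    public static void main(String[] args) {
        // 16 users * 300 rows * 10KB per row (the row size is an assumption)
        long kb = lowerBoundKb(16, 300, 10);
        System.out.println(kb + " KB, or roughly " + Math.round(kb / 1024.0) + " MB");
    }
}
```

The estimate only counts raw row data, so per-object overhead and any serialized copies would push the real figure higher.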
      • Ah, right. See, I just assumed no one would be stupid enough to store the images in the database that could be better served as files from a static webserver. Silly me!
    • by Epesh ( 2854 )

      Another obvious problem is that he's "testing EJBs" by completely ignoring them. He uses Java 1.4, sure, but the bigger problem is that he's not actually using EJBs at all - he's fetching the data directly from the DB in the servlet, and his "local EJB" returns the data immediately while his "remote EJB" serializes it.

      This isn't what EJB does.

      This is a retarded "benchmark." A real benchmark might be useful, but it'd require actually figuring out what kind of EJB implementation would make sense on the b

      • This test is not completely invalid! His "EJBs" will actually outperform real EJBs because they're doing less work. He's just illustrating a point in the interface design: an EJB is supposed to have a coarse-grained interface, so it returns all the presidents as an array in a single call. His benchmark demonstrates how such a design can cause memory problems.

        The real flaw in his benchmark is that he didn't publish the heap size he's using. We run JVMs with 1.7G heaps around here, and I'll bet his benchmark
        • by Epesh ( 2854 )

          The test is invalid. He's actually doing far MORE work with his "EJB" than a "real" EJB application would. I've written a version of his application, using actual EJBs and a decent architecture. The memory usage is nowhere near "worse" with the EJB version, because the container caches the entities. The scalability he illustrates is rubbish.

    • What's causing the OutOfMemory errors? If a pair of servers are falling over at 16 simultaneous requests for a 301 row dataset, there's a major problem.

      They're probably caused by not upping the default for the maximum amount of memory the JVM will use and not using a database connection pool. Considering how inept the rest of the "Streaming Presidents" article is, it wouldn't surprise me if these are the problems.
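Both suspicions are easy to check. The sketch below is not taken from the article's source: it prints the heap ceiling the JVM is actually running with (raised with the standard -Xmx option) and shows a deliberately tiny pool that reuses a fixed set of resources instead of opening a new one per request. SimplePool and HeapAndPoolDemo are hypothetical names, a plain Object stands in for java.sql.Connection so the example runs without a database, and it assumes a modern JVM (Java 8 or later) rather than the Java 1.4 used in the article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical minimal pool: creates a fixed number of resources up front and reuses them.
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Blocks until a resource is free, so at most `size` resources are ever in use. */
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
    }

    /** Hands a borrowed resource back for reuse. */
    void giveBack(T resource) {
        idle.add(resource);
    }

    /** Number of resources currently not borrowed. */
    int idleCount() {
        return idle.size();
    }
}

final class HeapAndPoolDemo {
    public static void main(String[] args) {
        // Heap ceiling for this JVM; start the JVM with e.g. -Xmx512m to raise it.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB");

        // Object is a stand-in for a real database connection (assumption).
        SimplePool<Object> pool = new SimplePool<>(4, Object::new);
        Object conn = pool.borrow();
        try {
            System.out.println("Idle while one is borrowed: " + pool.idleCount());
        } finally {
            pool.giveBack(conn); // always return the resource, even on failure
        }
    }
}
```

A production setup would use the application server's own DataSource pooling rather than a hand-rolled class like this one.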

    • You may also want to try GlassFish. An updated version of the Java EE 5 SDK [sun.com] was released today. It is free. Sun's Application Server (9.0 Update 1 Patch 1 [sun.com]), based on Project GlassFish [java.net], is included in the SDK. It contains a performance bugfix that enables record-breaking price/performance on the application tier, with a SPECjAppServer result of 521.42 JOPS@Standard - see Scott's latest blog [java.net] for all the details.
  • EJBs suxx0r (Score:3, Informative)

    by tedgyz ( 515156 ) * on Monday December 11, 2006 @07:58AM (#17193126) Homepage
    The article could say ANYTHING vs. EJBs is faster. I love Java and have successfully built several websites, but I feel EJBs are the antichrist. I avoid EJBs like the plague.
    • by Gr8Apes ( 679165 )
      EJBs, and many other things as well. EJBs violate KISS on many levels, not the least of which is avoidance of boilerplate code. <-- read as "maintenance nightmare"

      A lot of the "helpful kits" purport to make your life easier by vendor lockin to a dev kit with tons of boilerplate code that would have to be painfully changed on migrations or even upgrades <-- read this as "non-portability: maintenance nightmare part 2"

      Lastly, a single appserver on modern relatively inexpensive hardware can easily serve 5
  • Hmmm... (Score:2, Interesting)

    by Anonymous Coward
    A new DriverManager and a new db connection for every request? Welcome to 1998.

    Even though they return 1000 rows, 50 requests per minute is pretty poor. Voca processes 80 million bank payments per day using Spring [infoq.com].

    • ... and the author didn't recommend this technique for small transactions. In fact, he specifically said don't do it for those types of requests. Making a new connection for every VERY LARGE request might make sense depending on the scenario. If it takes 0.25 secs to login and get a connection, and 3 minutes for the database to execute a query and return 10,000 rows, then who cares about the 0.25 seconds?
  • by milton.john ( 604556 ) on Monday December 11, 2006 @08:00AM (#17193134)
    Anyone interested, please check the source code. If I understand it correctly, he is not using any EJBs in his benchmark; he simply "simulates" them by adding serialization+deserialization to the code. It's quite optimistic to call this a benchmark at all. Is it really that surprising that when I order a computer to do X and then to do X + something more, the second case will be slower than the first?

    Adding layers to an architecture is not primarily done to increase performance, but to create a clean and easy to maintain design. If the implementation is not performing as required, it should be profiled, and only then should the critical parts be optimized for performance. If somebody on my team dared to write code similar to this "streaming architecture" (read: plain old servlet with database access and a model object polluted with HTML tags), it would be his last contribution to the project.
    • Re: (Score:2, Insightful)

      by sonofagunn ( 659927 )
      I think you're missing the point of the article. Whether he is using EJBs or just an EJB-like pattern, the analysis of the algorithms' scalability is the same. It's common sense, but a lot of people who are trained in EJB development always do things the same way without really thinking about what's actually happening behind the scenes. This article makes a good case for using a database cursor and "streaming" objects to the web page as you read them, as opposed to letting a bean read the entire result s
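The difference between the two approaches can be sketched without any EJB or JDBC code. This is an illustration under assumptions, not the article's implementation: an Iterator<String> stands in for a database cursor, a PrintWriter stands in for the servlet response, and the class and method names are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: the Iterator plays the role of a database cursor.
final class StreamingSketch {
    /** Store-and-forward: materialize every row first, so memory grows with the row count. */
    static void storeAndForward(Iterator<String> cursor, PrintWriter out) {
        List<String> all = new ArrayList<>();
        while (cursor.hasNext()) {
            all.add(cursor.next());
        }
        for (String row : all) {
            out.print(row);
            out.print('\n');
        }
    }

    /** Streaming: write each row as it is read, so only one row is held at a time. */
    static void stream(Iterator<String> cursor, PrintWriter out) {
        while (cursor.hasNext()) {
            out.print(cursor.next());
            out.print('\n');
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("Washington", "Adams", "Jefferson");
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        stream(rows.iterator(), out);
        out.flush();
        System.out.print(buffer);
    }
}
```

Both methods produce identical output; the only difference is how many rows are alive in memory at once, which is the point being argued in this sub-thread.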
      • Re: (Score:2, Informative)

        by milton.john ( 604556 )
        I agree with you, but after so much simplification (and I am not 100% sure it is a correct simplification - he is in fact comparing the data transfer object design pattern with straight JDBC in a servlet), it's pretty meaningless to work with any numbers and compare results. The good point of that article is that it shows that advantages (such as clustering, failover, participation in transactions) do not come without a cost. That is true.
        But as a developer, you should always consider if you need technology like EJB. EJB an
        • Re: (Score:2, Informative)

          by sonofagunn ( 659927 )
          His point is not the actual values of the numbers, but rather how they change with load. If you were to plot the different techniques on a graph, you would hopefully see a straight line for the streaming approach, whereas the other techniques would have either a steeper slope or an exponential curve. It's the slope/shape of the line that's important when analyzing scalability, not the actual numbers.
    • by Laz10 ( 708792 )
      A good EJB or Hibernate based implementation would be caching all the president objects in no time.
      Then the EJB/Hibernate implementation would wipe the floor with his stupid freshman code.

      This article doesn't deserve this kind of attention.

      All that said, reporting isn't the strong side of classic EJB or Hibernate based programs.

      I'd recommend that you write the CRUD dependent code of your application using Hibernate and use a reporting program with plain SQL to generate reports if you have a lot of data. EJB or Hi
      • You can't cache what you don't have memory to store...

        To have one and only one copy of each president POJO in cache and shared among all threads would work in this test because the POJOs are read-only, but in a more real-world scenario, you'd have to make copies of them before returning them from the cache or protect access to them with synchronized blocks, which would kill the scalability.

        So, now that you're copying the objects before returning from the cache, each request has its own copy of the president
  • This article is pretty bare when it comes to the subject in its title. Out of all the content in the article, there are only a few paragraphs that really talk about this streaming architecture. I would almost go so far as to say this isn't streaming, because this is just common sense. However, it is just a name, and that is my two cents. But I have included the part of the article that pretty much drives home the author's point, as the rest of the article is merely setup and post-analysis.

    So the moral of this article, if you read and process any decent am
  • After reviewing the comments, I realize I could have been clearer as to intent in the article. Unfortunately, most of the comments here miss the mark. Only sonofagunn "gets" the point.

    The article compares two classes of algorithms with asymptotically different memory usage: traditional store-and-forward approaches against a streaming architecture. It is similar to comparing quicksort with bubble sort and saying that one is an O(n log n) algorithm while the other is O(n^2). The point of the practical example is
