Java Programming Technology

Should A High-Profile Media Website Abandon Java? 156

newbroom asks: "The company I work for runs a large, high-profile web site with users all across the world and delivers them large amounts of streaming media content plus textual stories. You might guess therefore that this is a news website, frequently updated throughout the day, and delivering content 24x365. No names, of course, for obvious reasons. We have a big, custom, Java content management system (based on a framework from a proprietary vendor as it happens, but could just as well be EJB/J2EE for all that it matters in the context of this argument) and for deployment we run our website using Java app servers on Solaris behind Apache." If you were going to take such a site from 1,000 users to 10,000 users, would you be able to do it using this kind of setup?

"It is all hugely expensive to license and to run, and it's not very scalable. We'd like to up our userbase from several tens of thousands to ten times that number - but the cost of scaling the Java/Solaris infrastructure is not trivial, because the Java servlet architecture costs too much in memory and execution time (creating several hundred KB of in-memory objects for each logon is expensive stuff!). On current hardware we can support only 1200-1500 concurrent logins and scaling up requires a new app server (e.g. 1 processor + 1GB RAM) and a $20K software license for each additional 600-750 concurrent logged-in users. And in today's 'cost per active subscriber' economics it doesn't add up - we cannot justify the present cost structure, by any rational measure, even before we try to scale it up.

So we're thinking of chucking it out and replacing it with a largely static site that is generated (written out to cache) from a new, simpler content management system. The few dynamic elements would be assembled using simple PHP scripts, frontending our existing Oracle DB server. We reckon we could serve vastly higher numbers, ten to a hundred times as many, of users on the same (or cheaper!) hardware: and it would be simpler by far to build and maintain and support.

I, personally, believe that the benefits of the Java system (rapid prototyping, development) are not important when large scale deployment is the issue. I am (as a user) fed up with large, poorly performing Java-based websites. My beef is not about Java the language though - it's a question of appropriateness. Fifteen years ago we'd prototype in Smalltalk and then code for deployment in C, and I feel the same applies here. The economics of the noughties do NOT support spending massive amounts of money on web infrastructure, unless the transactional revenue justifies it. Of course, most businesses generally don't justify it, in my opinion.

Our outsourcing partner who supports and maintains the architecture thinks we are crazy. Putting their potential loss of revenue aside they are hugely concerned that we'll not be able to support what we create. They are seriously against this idea.

I remember, prior to Java & the like, supporting simple CGI websites with tens & hundreds of thousands of users off of cheap FreeBSD systems, and we didn't have to pay an outsourced partner to do it.

So what does Slashdot think? What would you do if you were in the same boat?"

This discussion has been archived. No new comments can be posted.

  • by DevilM ( 191311 ) <devilm@@@devilm...com> on Monday September 29, 2003 @10:58PM (#7091180) Homepage
    Seems like your problem is one of architecture and not the underlying platform. You suggest that you would move away from a dynamic site built with J2EE to a generated static site built with PHP. If you really feel having a generated static site is the best way to go then why not leverage your existing Java infrastructure and have it generate a static site instead of serve a dynamic one? And if you can leverage your existing code base for that, then writing a new code base could still be done in Java, so I am not sure why you are pointing to Java as the problem.

    With all the above being said, I don't know what is wrong with your system, but it isn't that hard to build a dynamic site in Java that meets your scalability needs. All you need is a good caching strategy and you are set. Generally speaking, a good caching strategy coupled with a dynamic site can lead to as good as or better performance than a static site.
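To make the "good caching strategy" point concrete, here is a minimal sketch of a time-based page cache in Python. It is illustrative only: `TTLCache`, `get_or_build` and `render_front_page` are hypothetical names, and nothing here describes the poster's actual system. The idea is simply that an expensive page is rebuilt at most once per expiry window, however many requests arrive.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache: an entry expires ttl_seconds after it is stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.misses = 0  # number of times the expensive builder actually ran

    def get_or_build(self, key: str, build: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive work
        self.misses += 1
        value = build()
        self._entries[key] = (now + self._ttl, value)
        return value


def render_front_page() -> str:
    # Hypothetical stand-in for the expensive CMS and database work.
    return "<html>front page</html>"


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_build("/", render_front_page)
second = cache.get_or_build("/", render_front_page)  # served from the cache
```

Whether a 60-second window is acceptable depends on how stale a news page is allowed to be; that figure is an assumption, not something stated in the thread.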
    • I agree with DevilM.

      A J2EE or even a lighter Java servlet based solution may not be the best for your needs but it sounds to me like your "big, custom" content management system is at least somewhat bloated.

      Unless your system is very highly customizable by your users you should have all sorts of opportunities for caching and optimizations geared towards scalability.

      It's not the same but the webmail for UF [ufl.edu] scales to 2,000-3,000 concurrent users during peak load with only one gig of ram. Unlike a news

  • Wrong place man! (Score:4, Insightful)

    by Lally Singh ( 3427 ) on Monday September 29, 2003 @11:07PM (#7091188) Journal

    Sorry dude, but you're going to get many of these:


    1. Solutions pulled from the ass
    2. Solutions that sound real good, but were pulled from the ass
    3. Crap

    For your case, I hope I'm wrong. Just out of curiosity, have you considered, I donno, profiling your application, to see where the time's being spent? Also, how about switching to a more cost-effective Java platform? I've heard good things about this little thing called Linux.



    Note: My few comments certainly fall into Category 1 from above.


    But, your current solution does work, right? How much exploration have you done in optimizing the application? Oracle can certainly scale, and you're already willing to strip down the site to mostly static pages. Why throw out all that proven and thoroughly tested code? And if your outsourcer can't do that, maybe it's time to switch partners, not platforms. You have a large investment here, there's a good chance you can save most of it.
  • Python (Score:2, Informative)

    by costas ( 38724 )
    Python is a refactorer's dream. You can transition your Java application to Jython re-using your Java classes while ironing out the bugs and design of the Python code, implementing caching, static HTML generation and the like.

    When you're done, swap the JVM out of Jython and run pure Python with debugged code. If Python gives you any performance trouble, write small C-based modules for your frequently used code and wrap it in Python (fairly easy to do).
    • Re:Python (Score:5, Informative)

      by RevAaron ( 125240 ) <revaaron AT hotmail DOT com> on Tuesday September 30, 2003 @01:38AM (#7091422) Homepage
      That is the worst attempt at my-favorite-language-cheerleading I've ever seen. Ok, it's not, but it's still pretty bad.

      1. Writing C for web apps is not a solution. The wrapping tools for Python aren't impossible to use, but they can't perform miracles. Yes, it is very easy to use an external C function for performing some repetitive math function, an FFT or something- but in a data-intensive web app, it really makes no sense. In the case of the poster's problem, he and his team would end up re-writing half of the framework they're using in C, giving it Python interfaces. If they were having problems with just Java's raw execution speed, they could just as easily use Java's JNI to interface with C libraries.

      2. No matter how good it looks on paper, going from a big system written in Java for one particular framework to a system written half in Python and half in Java doesn't make all that much sense. They'll be dealing with the same bottlenecks, the same bloat- it's all running on the JVM. If anything else, they'd increase the footprint and slow the app down, as they're adding on yet another layer of complexity.

      Yes, I am fully aware that Jython outputs Java bytecode itself, but Sun's Java compiler does a lot better generating efficient Java bytecode out of Java than Jython does. Nothing inherent to Python or Jython, but when you've got a multi-billion dollar project like Java, when you consider what Sun puts into it- then compare that to the minuscule (by comparison) project that is Jython, it'd be absurd to expect the same results.

      I know it's easy to get a little jumpy when the dude mentions PHP and your favorite language is Python, or hell, anything that isn't PHP. You want to come in and say "hey, use my favorite language!" Believe me, I'm wanting to do the same thing, and I could substitute the word "Smalltalk" for "Python" throughout your post, and it'd be just as true; unfortunately, so would my points against it.

      Python and Jython certainly have their places, no doubt. Depending on a couple factors, I may use Python to write my system initially, but simply having a language that spits out Java bytecode doesn't mean you have some trivial, seamless transition between two systems.
      • Grandparent post is full of poo poo.
      • MOD THE PARENT DOWN (Score:3, Informative)

        by axxackall ( 579006 )
        The parent certainly does not have any experience of (or motivation for) properly doing a refactoring and migration from Java to Python.

        I've done it with two projects, one was heavily overbloated with EJB, another one was a typical JSP thing. In both cases I've moved to Python+Zope and it was done pretty quickly and smoothly.

        Well, I admit, I've done it without Jython, as I've found there was no need for temporary integration of old and new code aside from transparent authentication (which was simple - through LDAP

        • Comment removed based on user account deletion
          • Yes, it's true. In both projects the number of concurrent authenticated sessions was below 2000 per cluster. The number of concurrent requests (I hope you understand the difference) was below 200 per server (which is a pretty reasonable limit for many considered HTTP standalone listeners).

            We talked about Java vs Python and EJB vs Zope here. Tests I've made showed that on the same hardware (P3-4x500MHz + 2GB RAM) both JSP-no-EJB (Tomcat) and JSP+EJB (Tomcat+JBoss) are actually slower than Python+Zope. Pa

      • Yes, I am fully aware that Jython outputs Java bytecode itself, but Sun's Java compiler does a lot better generating efficient Java bytecode out of Java than Jython does. Nothing inherent to Python or Jython, but when you've got a multi-billion dollar project like Java, when you consider what Sun puts into it [snip]

        The lack of static typing (inherent in the Python language) will make it slower. Methods will be looked up by name (in string form) instead of by VFT offset.

  • You really will get a lot of ideas, grief, you name it. You're attacking a bunch of sacred cows with this question.

    One problem I see. Who's the architect of this thing? Is s/he being held responsible for your scalability problems?

    Another: Does your vendor/outsource/partner actually know what they're doing? Where are their references recommending them for building a highly scalable site?

    The other thing is: tuning tuning tuning.
    • Ask your software engineers to do what software engineers used to do: verify whether the design was made with scalability in mind. If not, it doesn't matter if it's a good design for just a two-node or ten-node cluster.

      Second: profile, profile, profile

      Third: well, almost anybody that has used a J2SDK (or JRE) on Solaris knows about its problems. Try to run Volano [volano.com]'s benchmark [volano.com] to know more about this. But like any benchmark, please don't believe your software will perform the same way the benchmark does. It is ju

  • by truffle ( 37924 ) on Tuesday September 30, 2003 @12:17AM (#7091239) Homepage

    You should be able to deal with a lot of your scalability issues by putting some kind of cache in front of your system, like Squid [squid-cache.org].

    But it sounds like every page on your site is really dynamic. And thus uncacheable.

    But you want to replace it with a mostly static site, so obviously, not all that dynamic stuff is required.

    Before you chuck the baby out with the bathwater:
    - Can you revise your existing Java site to serve most pages as essentially static?
    - If so, will putting some cheap squid cache boxes in front of your main servers do the trick?

    This technique really works, if you can do it.
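A caching proxy can only help with responses it is allowed to store, so the application has to mark which pages are shared and which are per-user. The sketch below shows one way to make that decision with standard HTTP `Cache-Control` directives. The function name `cache_headers` and the 300-second lifetime are assumptions for illustration, and how a given proxy honours these headers depends on its own configuration.

```python
from typing import Dict


def cache_headers(is_personalized: bool, max_age_seconds: int = 300) -> Dict[str, str]:
    """Pick response headers that tell a shared cache what it may store."""
    if is_personalized:
        # Per-user content: shared caches must not store or reuse it.
        return {"Cache-Control": "private, no-store"}
    # Same bytes for every visitor: let the proxy serve it for a while.
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


story_page = cache_headers(is_personalized=False)
account_page = cache_headers(is_personalized=True)
print(story_page["Cache-Control"])
print(account_page["Cache-Control"])
```

If most story pages fall into the first category, the app servers only see the occasional refresh from the proxy rather than every reader.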
    • Warning: ASSERTION fails for all known values in the universe
    • Caveat Emptor. If the issue is database latency, then caching can improve performance but at the cost of scalability.

      Why? Because a simple caching scheme precludes the use of clustering due to the problem of the dirty read.

      You can go with a more complex caching scheme that writes dirty objects back to the database but there are problems with this approach. Typically, you either decide that it's okay for the user to see data that isn't always current or you have to implement some kind of "heart beat"

  • by jsse ( 254124 ) on Tuesday September 30, 2003 @12:41AM (#7091262) Homepage Journal
    On current hardware we can support only 1200-1500 concurrent logins and scaling up requires a new app server (eg 1 processor + 1GB RAM) and a $20K software license for each additional 600-750 concurrent logged in users

    I'm afraid your company should seriously consider another J2EE platform, rather than uprooting your existing architecture.

    First of all, fuck SUN. I'm biased, of course, because I'm pro-Linux in this case. SUN's J2EE app server is nearly the most expensive among its competitors, not to mention the incremental maintenance cost incurred by expensive SUN hardware. Nowadays big corps like IBM and HP offer enterprise support for J2EE on Linux platforms, and their support is M3(24/7) with at least 3 9's maintenance

    Also, you don't pay per user for large scale web deployment, you pay per server license. Fuck SUN's sales multiple times for not reminding you of better license terms for your new deployment.

    I remember, prior to Java & the like, supporting simple CGI websites with tens & hundreds of thousands of users off of cheap FreeBSD systems, and we didn't have to pay an outsourced partner to do it.

    You're just going backward in this case. The J2EE platform exists to solve various problems with CGI. One of our deployments just switched from CGI to J2EE because the former was unstable when handling a high volume of requests. Of course, I've been told of many successes with CGI, but J2EE seems to fit in this case.

    Besides, I don't understand why you have a scale-up problem with J2EE. Scalability is the major advantage of J2EE. In our most recent project, we decoupled the RDBMS (Oracle), web tier (Apache), app server (9iAS) and EJB containers (OC4J) into 4 separate Linux cluster pools and one shared storage of SCSI raw disks. We can easily scale up our architecture to meet various requirements.
    • Also, you don't pay per user for large scale web deployment, you pay per server license. Fuck SUN's sales multiple times for not reminding you of better license terms for your new deployment.

      He specifically said it was a non-J2EE proprietary Java app server. My guess is ATG Dynamo (a pre-J2EE version). I doubt Sun's sales had anything to do with recommending it.

  • If you design a site in such a way that you're getting performance that bad, well, it doesn't matter which language you use.

    PHP, Perl, C, etc. are not the panacea you're looking for; good architecture is what you need. Languages have their pluses, but it sounds like you just need a better design.

  • by Phouk ( 118940 ) on Tuesday September 30, 2003 @01:02AM (#7091297)
    No, you should not - throwing out a significant body of tested, working code in favour of "new, better!" code is not a good idea generally. A well-known article by Joel Spolsky eloquently explains some of the reasons [joelonsoftware.com].

    Instead, try to improve your current solution and bring its cost down:

    1. If you are scaling up by adding servers, see if you cannot add something like two Linux servers instead of one Sun (for a fraction of the price, naturally).
    2. You haven't said which software packages bring up the license cost (besides Oracle, of course). But for most, there are open source alternatives. Sure, they might take more work to set up in some cases, but certainly less than rewriting the whole application, no?
    3. You might even want to evaluate if you can replace your Oracle by a SAP DB [sapdb.org] instance if that is not your bottleneck (Hint: Caching! Caching!).
    4. If, as you say, hundreds of KB are used up for every logged in user, then, in all likelihood, there are big inefficiencies in your code. You should profile it / have it profiled for both CPU and memory efficiency. Then tune the 5% of the code that uses 95% of the resources, instead of throwing away 100%.
    5. Are the outsourced programmers up to snuff? Maybe have the code checked by a third party (who could also do the profiling / tuning). Because a bad programmer can bring down any infrastructure, be it J2EE, .NET, PHP or whatever. It's the man, not the machine.

    Good Luck!
    • This is excellent advice (i.e. MOD UP).

      It sounds like some areas of the Java solution are giving particular problems, in particular with respect to memory. If this functionality is provided by the application server, you need to look at alternatives (WebSphere, JBoss, etc). If it is provided by your custom-built CMS then profiling and limited refactoring to make use of object pools, etc, should take you a long way.

      Most of the cost issues (apparently) relate to the application server and its host syste

  • Don't do it (Score:4, Insightful)

    by farnsworth ( 558449 ) on Tuesday September 30, 2003 @01:04AM (#7091302)
    You say that your 2 biggest problems with the current setup are cost and performance. A rewrite in a new language will not necessarily solve either problem. If you want to tackle cost, look at free/cheap app servers. If you want to tackle performance, look at the code. Why are you using hundreds of kilobytes of RAM for each user? That's a huge red flag and an indication that perhaps your application is either not optimal, or legitimately doing some major leg work for each user. Either way, a rewrite won't fix those problems unless you understand what they are.

    Think of the risks that a rewrite introduces:

    - you break existing business logic with the new implementation
    - you build a system that is slower than the existing one
    - you take way too long to finish it, all the while you have to pay your existing licences

    Typically, the argument for a rewrite is that the cost of implementing new functional requirements is higher than the cost of just implementing them in a brand new system. Have you tried optimization? How are you maintaining session state? Do you know what it would take to get your app running in a free container? Have you looked at free/cheap caching APIs?

    Further, Java may not be the out-of-the-box fastest platform, but there is no reason whatever that it can't scale in an environment designed for it. Yes, you may need to have many smaller machines because of JVM memory issues, but that's exactly what you should want. The ideal situation is when you can say, 'if we need to support 10x the current users, we just need to drop in 10x more app servers.' It's called 'scaling linearly.'
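As a worked example of what linear scaling costs under the figures in the original question (one extra app server and a $20K license per additional 600-750 concurrent logged-in users), here is a small calculation. The target of growing from 1,500 to 15,000 concurrent logins is an assumption chosen for illustration; the question itself only says the userbase should grow tenfold.

```python
LICENSE_COST_USD = 20_000   # per additional app server, from the question
CURRENT_CONCURRENT = 1_500  # upper end of the stated 1200-1500 range
TARGET_CONCURRENT = 15_000  # assumption: ten times the current concurrency


def extra_servers(extra_users: int, users_per_server: int) -> int:
    """Whole servers needed for the extra users (integer ceiling division)."""
    return -(-extra_users // users_per_server)


extra_users = TARGET_CONCURRENT - CURRENT_CONCURRENT
estimates = {}
for users_per_server in (600, 750):
    servers = extra_servers(extra_users, users_per_server)
    estimates[users_per_server] = (servers, servers * LICENSE_COST_USD)
    print(f"{users_per_server} users/server: {servers} servers, "
          f"${servers * LICENSE_COST_USD:,} in licenses")
```

Under those assumptions the license bill alone lands somewhere between $360,000 and $460,000, before hardware, which is the number a cheaper app server or a lower per-user memory footprint would have to beat.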

  • or shouldn't this sort of thing be under 'Ask Slashdot'... that's what that section is there for, is it not?
  • So what does Slashdot think?

    I'm so glad you asked. Given the scant details, this should be great fodder for a Java/PHP/whatever flame war. Seriously, you expect good advice based on that? Having said that, look at some 'real' big sites and see if they use Java: yahoo, google, slashdot, etc.
  • yes, switch (Score:5, Interesting)

    by consumer ( 9588 ) on Tuesday September 30, 2003 @01:37AM (#7091418)
    It doesn't have to be to PHP, it could just be to an open source Java platform, but get off your expensive proprietary platform before it drives you into the ground. Java has good enough performance when done well, but most commercial Java frameworks make it hard or impossible to write anything that isn't bloated, so ditch that thing. If you are good with PHP and your site is reasonably simple that should work fine. So would Python, Perl, simple Java servlets, etc. Static publishing (from a database) is a great idea if you can get away with it.

    Ditch Solaris and go with Linux or FreeBSD on Intel hardware. Amazon and AOL did it and saved buckets of money, so you should feel confident that you can do it too.

    • Why should they toss out Solaris on their Sun hardware when they've already spent a helluva lot of money on that system? Why not work with what they have?

      This has come up a million times I imagine. The vast majority of companies will not ditch a huge hardware, software and training investment on a system like Solaris so that they can have the feel-good-vibe of knowing that they are using an open source platform. They couldn't give a rat's ass if Linux is open source or not. Sure, they could install Linu
      • Keeping the Solaris hardware costs them money. There are licensing fees, support contracts, and it's all being outsourced right now so they may not even own it. Replacing it with systems that have zero licensing costs seems like a big win to me, but only they know enough about the economics involved to say for sure.
        • Solaris has no licensing fees. Any Linux box they bought would have a support contract too if they wanted it - there's nothing forcing them to have one for their Sun boxes either. Looking at kit like the V210, V240 or V440, the Solaris kit would cost the same as or less than the Intel kit anyway.
  • I have to wonder, seriously, if this is the right forum for such a question. You're asking people to make a snap decision on very little information and in all seriousness, it's something you're getting paid to decide, not us. There's plenty of information out there on the web about doing this, and you're asking us to do your research. /end soapbox
  • If you are looking at a rewrite of the proprietary CMS, it's probably not that cost effective if you are only paying 20k/server (one time or yearly?) + maybe a 3k/year maintenance/service cost. Figure you will need to hire at least two, possibly up to 4 developers to write the new CMS, including importing all of the existing content, user information, preferences, history, etc. Even then, you'll be responsible for any upgrades you need. Have you asked the vendor about the performance problem? You're proba
  • It sounds like your company was one of the early adopters of content management systems.

    Most of the early CMSs built during the boom times were large, very flexible and complex systems that tried to be all things for all people.

    I think products like ATG Dynamo are great but they tend to be very over-engineered for small to medium size sites and maintaining them becomes a nightmare of inter-dependencies. It doesn't mean they won't perform and scale, just that you need very experienced people that k

  • by Zachary Kessin ( 1372 ) <zkessin@gmail.com> on Tuesday September 30, 2003 @02:24AM (#7091598) Homepage Journal
    It sounds like you have a shop full of good Java people. While you may want to change how things are run, I would not change languages. This is not based on any love for Java, but on the fact that if you have a team of Java programmers, getting them to the point where they can write top-flight code in something else will take time that you can better spend elsewhere.

    But I would consider changing the architecture if that makes sense.
  • Not to be prying or anything, but what's your role in your organization?

    CIO? Sysadmin? Architect? Project manager? Consultant? Programmer?

    Do you make architecture/infrastructure decisions?

  • by pi_rules ( 123171 ) * on Tuesday September 30, 2003 @02:57AM (#7091679)
    You have a J2EE (or something like it) based application that is non-portable in both its host OS and host application server and on top of that doesn't scale too well because of memory/CPU requirements?

    Hmmm... somebody should be losing their job. Either the consultants who built it or the person that approved such a thing.

    If you had a true J2EE app that wasn't coded by a team of monkeys on a wild rampage this shouldn't be a problem. The "porting" to a new app server should be trivial, if anything has to be done at all. You'd be able to keep the Sun hardware and whatever app server you use on it, while chucking in Linux machines with BEA, WebSphere or maybe JBoss on them. Slap a hardware based load balancer in front of it and viola, horizontal scalability.

    I didn't see anybody else take offense to this, but 100k+ per user login memory usage? I might be showing my age, or rather my roots, but that seems excessive. My guess is your app's written like the app I now support. User logs in and everything about them is swallowed up into session (or application) wide collections immediately. The "lazy caching" thing just didn't cross these guys' minds. Of course, in my case neither did the "mark data dirty" thing but that's another matter.

    Please, somebody show me 100k worth of data that you would really want on-demand from-memory on a user at any given instant. Just a C struct or something would suffice.
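The "lazy caching" idea mentioned above can be sketched as a session object that holds only the user id at login and fetches each piece of user data the first time a page asks for it. This is a hedged illustration, not a description of any real framework: `LazySession`, the loader functions and the field names are all hypothetical.

```python
from typing import Any, Callable, Dict, List


class LazySession:
    """Session that loads each field on first access instead of at login."""

    def __init__(self, user_id: int,
                 loaders: Dict[str, Callable[[int], Any]]) -> None:
        self.user_id = user_id
        self._loaders = loaders            # field name -> function that fetches it
        self._loaded: Dict[str, Any] = {}  # only fields some page actually used

    def get(self, name: str) -> Any:
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name](self.user_id)
        return self._loaded[name]

    def loaded_fields(self) -> List[str]:
        return sorted(self._loaded)


calls: List[str] = []  # records which hypothetical database lookups ran


def load_preferences(user_id: int) -> Dict[str, str]:
    calls.append("preferences")
    return {"layout": "compact"}


def load_history(user_id: int) -> List[str]:
    calls.append("history")
    return ["story-1", "story-2"]


session = LazySession(42, {"preferences": load_preferences, "history": load_history})
layout = session.get("preferences")["layout"]
layout_again = session.get("preferences")["layout"]  # no second lookup
```

A user who only ever reads the front page never pays for the history lookup, so the memory held per login stays close to the size of the data that was actually needed.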

    J2EE apps can be bleeding fast ultra lean sons of bitches if you do it up right. It can also be a dog-ass slow memory-hogging bastard. It just depends on who you had at the business end of the whiteboard I guess.

    Going the PHP/static generation/caching route isn't necessarily a bad idea either... but I don't think you should have to do this. I'm seeing the maintenance of such a system as a big onus on the system administrators to make sure everything is up all the time... I know of no PHP frameworks out there that would let you drop sessions from one system to another. I've never tried pushing PHP that far though.

    If you're a system admin such a system might seem ideal... because while the systems and network might be a little "wonky" that's your domain and you feel comfortable supporting such a thing. I can't fault you for that; however I do think the onus is on the application development team. It is their job to make something scalable and construct it in a manner that it should fail over, recover, etc. from anything weird that may go on.

    This isn't realistic, but you probably purchased a scalable application touted as portable because it's Java and you didn't get that. Demand that. If they can't deliver, boot them out the door and take it in-house if you must, but I see many obstacles in your path on the system admin side alone... and certainly the re-development of it won't be a cakewalk.

    Scalability problems are largely the development team's responsibility, so long as such a requirement was put forth in the original development. Good system administration can help to reduce their errors along with a good helping of hardware but that's just a bandaid to the real solution.

    Just my two cents.
    • Slap a hardware based load balancer in front of it and viola

      Personally, I'd prefer a cello for this type of application.

    • Going the PHP/static generation/caching route isn't necessarily a bad idea either... but I don't think you should have to do this. I'm seeing the maintenance of such a system as a big onus on the system administrators to make sure everything is up all the time... I know of no PHP frameworks out there that would let you drop sessions from one system to another. I've never tried pushing PHP that far though.

      I personally have pushed PHP that far. From my personal experience PHP sessions provide a very clean

  • Broadvision? (Score:3, Insightful)

    by jag164 ( 309858 ) on Tuesday September 30, 2003 @03:36AM (#7091805)
    It almost sounds like you are stuck in a Broadvision nightmare....notoriously a lousy system with even worse scalability issues. It'll scale alright....at an exponential rate that any normal J2EE solution will scale. Either start designing and coding a re-write with better CMS third party tools while keeping a few people on the BV maintenance side or be prepared to drop a few $100k into hardware per every 2000 concurrent users. *cringes* Good luck.

    Also, anyone who got suckered into a Broadvision sales pitch for an enterprise solution should be shot^H^H^H^H fired on first sight.

  • How about... (Score:1, Flamebait)

    by KDan ( 90353 )
    Learning what you're blabbing on about. You seem to be a complete neophyte when it comes to actually doing something with Java. Are you a PHB by any chance (a particularly geeky type of PHB who reads slashdot...)?

    It's just so typical for people who don't have a clue about web application development to make stupid claims like:

    1) J2EE is for rapid development/prototyping of web apps. - J2EE is far slower than anything else I've used. The advantage of Java is that if you use it properly, it lets you creat
  • by buro9 ( 633210 ) <david&buro9,com> on Tuesday September 30, 2003 @04:52AM (#7091991) Homepage
    I've worked on some very large sites with concurrent users running into the hundreds of thousands... these range from http://www.btopenworld.com/ through to the UK's Football League clubs and premium content video sites.

    In my experience, Java was not the wisest choice: it was bloated, difficult to maintain (that's one that will rile the pro-Java camp), required too much focus on non-business areas (i.e. creating things like session pools and encryption when we should have been focusing on getting the actual business requirements fulfilled), and created an object model bureaucracy (pure OO with respect to encapsulation? or break the purity of the model because you know in advance that you want 27 objects and you could get them all from one piece of SQL, but this would have presumed knowledge of the internals of the object and thus have broken the rules of encapsulation).

    All in all, Java proved to be the most substantial factor in late deliveries of projects, limited scalability... and expense (you wouldn't run Java on Windows, and we were running it on some very sizeable Sun boxes). We made several major attempts at performance improvements: memory caching, singletons to persist seldom-modified data, re-working SQL, etc. But this didn't help dig us out of the hole that we were in.

    As a comparison, we also ran some Windows boxes with ASP 3 code on it... used prolific file system caching, and because of poor OO support abandoned hope of properly creating encapsulation and objects purely... we did use re-usable components in DLLs, and we did do extensive work to cache page parts in both memory and on disk according to the predicted frequency of use.

    Both systems were behind reverse proxy caches... but the Java had the benefit of all pages being cached (as authentication ran in an NSAPI plugin on the proxy), whereas the ASP did not have its pages cached (just the images, styles, etc) as authentication code ran in the pages (it had not moved behind the plugin when I had left the company).

    Yet of these... the ASP consistently performed better on page generation times, concurrent users, etc... even though the ASP boxes were just cheap Compaq servers and the Java boxes were very over-specced Sun servers.

    My experience of all of this led me to the following conclusions... which were always obvious but merely got reinforced.

    1) Right tool for the right job. And at the moment that means considering things like PHP, Perl, ASP for web pages... not Java. String manipulation languages and those with lower overhead perform better on web sites.

    2) If you do use Java, be prepared to dilute the purity of the object model you create to favour performance. DO NOT get caught in the trap of believing that object model purity is more important than total performance/maintenance... OO purity does not necessarily make maintenance easier... documentation and commenting achieve that more.

    3) Cache everywhere! Parts of pages, generated pages, the images and styles used on pages, the queries in the database.

    4) Control your cache flushing fiercely! Do not allow apps to ever flush anything that you are not sure has to be flushed... wild-card flushing should never occur. If you stay in Java, implement the Observer pattern and persist and serialise data everywhere.
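    To make points 3) and 4) concrete, here is a minimal sketch of Observer-driven, per-key cache flushing. It is in Python for brevity rather than Java, and every name in it (ContentStore, FragmentCache, the keys) is hypothetical, not taken from the system described above. The point it illustrates is only that a change to one item flushes that item's cached fragment and nothing else.

```python
class ContentStore:
    """Holds raw content and tells observers exactly which key changed."""

    def __init__(self):
        self._data = {}
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def get(self, key):
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        for notify in self._observers:
            notify(key)  # only the changed key is reported, never a wildcard


class FragmentCache:
    """Caches rendered page fragments; flushes one entry per change notice."""

    def __init__(self, store, render):
        self._store = store
        self._render = render
        self._fragments = {}
        self.renders = 0  # counts how often the expensive render step ran
        store.subscribe(self.invalidate)

    def fragment(self, key):
        if key not in self._fragments:
            self._fragments[key] = self._render(self._store.get(key))
            self.renders += 1
        return self._fragments[key]

    def invalidate(self, key):
        self._fragments.pop(key, None)


store = ContentStore()
cache = FragmentCache(store, render=lambda text: "<p>" + text + "</p>")
store.put("headline", "Old news")
store.put("weather", "Rain")
cache.fragment("headline")
cache.fragment("weather")
store.put("headline", "New news")  # flushes the headline fragment only
print(cache.fragment("headline"), cache.renders)
```

    The weather fragment survives the headline update, which is the whole idea: the flush is as narrow as the change.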

    Ultimately it comes down to architecture... but I have witnessed that Java encourages really strange architecture as everyone starts running after a holy grail of a pure object model.

    I would generally favour not using Java and going for the re-write. Other languages encourage pure string manipulation and control of what you're doing at a far more approachable level.

    Remember that you're only creating web pages:

    1) Query database
    2) Concat string
    3) Echo string
    4) ???
    5) Profit!

    It really isn't hard, and doesn't need rocket science. Look at /., we all love it, and it's on Perl!

  • by Anonymous Coward
    ... we need highly publicised failures, to counter MS marketing.
  • Update your resume (Score:3, Interesting)

    by smoon ( 16873 ) on Tuesday September 30, 2003 @05:51AM (#7092145) Homepage
    It sounds like you've got a site written around a very proprietary system, and that your scalability etc. is tied to what that proprietary system can do.

    The solution, therefore, is to get away from the proprietary system. But only if you think you can do better. Either find a better proprietary system or write your own. If you write your own then plan for 'scale out' on lots of servers running something cheap like *BSD or GNU/Linux, Apache, Tomcat, JBoss, Postgres, MySQL, etc.

    If you _can't_ get away from the proprietary app, then perhaps you can 'wrap' it in something else. Use static pages, PHP/mod_perl/C++/Lisp/jsp/whatever and a cheap but good database (mysql, postgres). Use these for all of the 'custom' content. Then have them access your 'back end' and dumb down the back end to get rid of everything that is not essential to a data feed. If possible aggregate the 'php' users into a few categories for the CMS to deal with. E.g.: have a 'sports' profile with 10,000 php users accessing a single 'sports' user on CMS.
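    A rough sketch of the "one back-end user per profile" idea, in Python for brevity. Everything here is an assumption for illustration: fetch_feed stands in for whatever call the proprietary CMS actually exposes, and the 60-second lifetime is arbitrary. Front-end users are mapped to a profile, and the back end is asked once per profile per time window, however many users share it.

```python
import time


class ProfileFeedCache:
    """Serves many front-end users from one cached back-end feed per profile."""

    def __init__(self, fetch_feed, ttl_seconds, clock=time.monotonic):
        self._fetch_feed = fetch_feed  # hypothetical call into the CMS
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # profile -> (expires_at, feed)

    def feed_for(self, profile):
        now = self._clock()
        entry = self._entries.get(profile)
        if entry is None or entry[0] <= now:
            # One back-end request refreshes the feed for every user
            # who shares this profile.
            entry = (now + self._ttl, self._fetch_feed(profile))
            self._entries[profile] = entry
        return entry[1]


backend_calls = []


def fetch_feed(profile):
    backend_calls.append(profile)
    return ["story for " + profile]


feeds = ProfileFeedCache(fetch_feed, ttl_seconds=60)
for _ in range(10000):  # 10,000 'sports' readers
    feeds.feed_for("sports")
print(len(backend_calls))
```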

    Try negotiating with the vendor. Perhaps you can present your 'success story' at a Gartner symposium or somesuch. Complain about scalability. Demand a Linux version. Get them to agree to some unlikely performance guarantee and use that to cut costs down (via penalties). Get some free consulting from them to help fix the problems. Make sure to wear a T-Shirt or use a pen from their major competitor whenever they are around -- much more fun that way.

    Find a failed .com that used the same proprietary system. Buy the company for pennies on the dollar and assume their license portfolio.

    Another approach is to update your resume and get the heck out of there.
  • Usually there are reasonable things that can be done to increase performance at a low cost. Optimize a query here, cache some data there, remove a big memory sink there, and next thing you know, your application is running a lot leaner.

    On the other hand, sometimes it is worthwhile to just chuck out the whole implementation and start over again. In my case, we're thinking of doing that because we want to implement a bunch of new functionality that really changes the intent of our application. And after 7
    • However, as an IT manager, I rarely find it "sellable" to take an internally-ugly application, replace it with another application of identical functionality, and tell anyone that it's a success.

      Wrong approach. You're not 'replacing' the app with an identical app, you're doing a 'major performance upgrade' to the app (new major version). If that new major version happens to share no lines of code in common with the previous version, so be it :-)

      As for replacing the hardware, I agree, there's no good r

  • by Bazzargh ( 39195 ) on Tuesday September 30, 2003 @06:48AM (#7092314)
    There's a lot of comments here to this effect already, I'm just going to add my voice.

    If you have 100s of K per login it almost certainly isn't the platform's fault, and it probably isn't the developer's fault either - all that memory must be going to customize content for the user, which means you can trace the performance problems back to the requirements. (your developers could be crap too, but profiling will tell you!)

    If the user gets content which requires a massive amount of customisation on each and every page - and this is a requirement - then performance will suck no matter what the platform, as that memory will still need to be used.

    I've been through this before with a customer who demanded we try out every app server under the sun to resolve performance problems even though we showed him profiling figures that proved only 1% of the time per request was appserver overhead - 80-85% was in the DB, and the rest was the app code. Because the customer took a "religious viewpoint" that the appserver was wrong rather than believing the profiling data, we wasted weeks.

    You need to profile before you can state that java is the problem - and equally, you need to profile before you can state that it's not.
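    As a sketch of what that kind of per-tier breakdown can look like, here is a small timer in Python. It is not the tooling used in the story above; TierTimer, the tier names and the sleep calls standing in for real work are all made up for illustration. A real profiler gives far more detail, but even this coarse split shows where a request spends its time.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TierTimer:
    """Accumulates wall-clock time per tier and reports each tier's share."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def measure(self, tier):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[tier] += time.perf_counter() - start

    def shares(self):
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {tier: spent / total for tier, spent in self.totals.items()}


timer = TierTimer()
with timer.measure("db"):
    time.sleep(0.05)   # stand-in for the database round trips
with timer.measure("app"):
    time.sleep(0.01)   # stand-in for application code
for tier, share in sorted(timer.shares().items()):
    print("%-4s %5.1f%%" % (tier, share * 100))
```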
    • If you have 100s of K per login it almost certainly isn't the platform's fault, and it probably isn't the developer's fault either - all that memory must be going to customize content for the user

      Then throw the user away, that's the faulty component. Or give him fixed, ugly content so he doesn't come back. Oh wait, the user is paying the job...
  • into seeing the problem in the wrong place. "[...] and for deployment we run our website using Java app servers on Solaris behind Apache."

    This isn't the problem.

    (creating several 100Ks of in memory objects for each logon

    This is.

    So what does Slashdot think? What would you do if you were in the same boat?"

    Don't know what /. thinks, but I think you need a serious rethink of your application's design. It sounds to me like you're throwing away exactly the advantage that servlets can give you by creat

  • by mactari ( 220786 ) <rufwork.gmail@com> on Tuesday September 30, 2003 @08:24AM (#7092813) Homepage
    Looks like I need to bring Joel Spolsky's excellent article, Things You Should Never Do, Part I [joelonsoftware.com], to a new readership.

    The article speaks for itself, but essentially Joel's point is, "If it ain't broke, it's going to take you a heck of a lot longer to rewrite something inferior than you could've ever expected." Old code has tons of lessons learned that you'll never tease out. New code is easy to read and can implement every buzz word you'll find on O'Reilly Net right now, but it won't be battle-tested.

    If you're still able to even think about throwing out your old investment and moving to CGI and BSD, however, I'm thinking your site isn't doing anything very fancy. If you don't have much customization invested in your proprietary system, what Joel and I are saying is moot, especially at the licensing fees you're mentioning.

    I'd also point out the title is very misleading. It's not Java that's the issue -- it's your system's architecture. Java is just as capable of creating a "largely static site that is generated (written out to cache) from a new, simpler content management system" as language X. This is quite similar to the discussion we had about whether Java is an SUV [slashdot.org] just a while back (if it is an SUV, btw, that's not a bad thing [blogspot.com]). Your programmers' skillset is what's most important. If they already have a familiarity with Java, why ditch it?

    So, keeping true to the post that says the recommendations here come out our arse, here's another pulled from the same place:

    I'd recommend trying to refactor [extremeprogramming.org] your current codebase to do two things. First, try to implement your static page idea using your current system. Second, take out as much of the crappy, non-scalable system that happens to be written in Java as possible. You don't name the system, but the whole advantage of Java is that it doesn't need to be platform-specific (if done right). Ditch Solaris. Create a server farm of cheap x86 hardware with Linux or BSD with a JVM installed. Reread your license -- if you have thirty "clients" (new Linux servers) making static pages from one legacy server's dynamic content, can you pay a lower fee?
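    The static page idea itself is small in any language. A minimal sketch in Python (chosen for brevity; the same few lines work in Java), where the article data, the file layout and the function names are all invented for illustration: render each article once, write it to disk, and leave the file alone if nothing changed so the web server and any proxy in front of it keep serving the cached copy.

```python
import html
import tempfile
from pathlib import Path


def render_article(title, body):
    """Render one article to a complete HTML page."""
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><p>{1}</p></body></html>"
    ).format(html.escape(title), html.escape(body))


def publish(articles, out_dir):
    """Write one file per article; return the slugs actually (re)written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, body) in articles.items():
        page = render_article(title, body)
        target = out / (slug + ".html")
        if target.exists() and target.read_text(encoding="utf-8") == page:
            continue  # unchanged: leave the cached file untouched
        target.write_text(page, encoding="utf-8")
        written.append(slug)
    return written


articles = {
    "election": ("Election results", "Counting continues."),
    "weather": ("Weather", "Rain & wind."),
}
with tempfile.TemporaryDirectory() as site:
    print(publish(articles, site))  # first run writes every page
    print(publish(articles, site))  # second run writes nothing
```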

    PS -- Who said Java was good for prototyping? Visual Basic (and vbscript/ASP or *gulp* ColdFusion), sure. REALbasic, sure. Java? Are you folks mad?!! ;^)
  • The problem is that a java-based solution is used to develop a highly scalable real-time system.

    This is just not possible.

    Whatever you Java-gurus would say, it won't scale, it won't run as fast as it could.
    Yes, perhaps development time is somewhat shorter in java, but if you want to tweak your app to run 0.75 as fast as a properly developed C++ solution, you would spend 5 times as much time profiling and looking for bottlenecks, when, in fact, the bottleneck is the java architecture itself.
    In fact, the be
    • You have to be crazy to recommend to a person who is complaining about tie-in to a vendor solution that he should 'hire 5 experienced C++ programmers' and hope that they stick around for the future. When looking at languages and architecture you shouldn't go down a path whereby you end up over a barrel with 5 programmers telling you they want a 500% pay raise because they designed/wrote/implemented your webserver/application/infrastructure. Balance is the key
  • 1. Are you the programmer?
    2. Does the programming staff know PHP?
    3. Could you easily change your site to be static pages?

    You might have a lot of solutions available. Have you looked at JBOSS? It is a much lower cost Java app server. You could do JBOSS on Linux and save a bundle compared to Solaris.

    PHP is a good solution to a lot of problems but it is not what I would choose to generate static pages. Perl or Python are the classic choices but you could get wild and use Java, Lisp, or even OCaml. You
  • There are a number of dynamic cache products that are developed just to solve your problem - from WARP Solutions (warpsolutions.com) just to name one. It's a much lower risk solution than to wholesale drop all your technology investment and replace it with something that could be just as bad, or worse!
  • Don't step down (Score:3, Informative)

    by Hard_Code ( 49548 ) on Tuesday September 30, 2003 @12:15PM (#7094817)
    Java/Servlets can absolutely handle the load. I sincerely question your suggestion to step DOWN to PHP. While PHP is great for small projects, it is pretty MISERABLE at scaling because it has a huge gaping hole of not supporting application persistence. The very thing you DO NOT want to do with PHP, is attach it to a database with lots of SIMULTANEOUS users, because PHP has little or no way of pooling resources (e.g. your database connections will scale in one to one ratio with your users == BAD THING).
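    For anyone who hasn't seen pooling spelled out, here is a minimal sketch of the idea, in Python rather than Java to keep it short. ConnectionPool and FakeConnection are hypothetical names; a servlet container or JDBC pool does the same job with far more care (validation, reconnects, timeouts). What matters is that the number of database connections is fixed by the pool size, not by the number of users.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hands out a fixed set of connections; callers wait when all are busy."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return it to the pool


class FakeConnection:
    """Stand-in for a real database connection."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, sql):
        return "result of " + sql


pool = ConnectionPool(FakeConnection, size=5)
for user in range(1000):  # 1000 requests share 5 connections
    with pool.connection() as conn:
        conn.query("SELECT 1")
print(FakeConnection.opened)
```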

    See Ace's Hardware articles on how they converted from PHP to Java/Servlets/JSP, it is a blow-by-blow walkthrough that reads like a HOW-TO:

    Building a Better Webserver in the 21st Century [aceshardware.com]
    SPECmine - A Case Study on Optimization [aceshardware.com]
    Scaling Server Performance [aceshardware.com]

    The move to a Java-based web application marks a giant step forward for our site software. While the "applications" we previously ran on Apache and PHP were little more than individual scripts interpreted by the webserver on request, the new site is in and of itself a complete, running, multithreaded application. When a request is made, the application starts a new thread to serve the request. Database connections are allocated as needed from a shared connection pool, maintained by the application.

    In the case of the interpreted scripts of old, programs were compiled and executed on the fly in a stateless manner. The scripts only ran when they were requested, and so there was no communication between threads or components and no sharing of resources.

    Our new software platform enables us to build true stateful applications that can create and share global resources. For instance, our message forums make use of a shared message index cache that, for all intents and purposes, frees the database from nearly all read activity. The cache is shared in memory amongst all threads and it is only updated when a write operation is made to the database for a new posting, an edit, or a deletion. Such a cache would be very difficult to implement in something like PHP or PERL because it's not possible to share persistent objects among different instances of an interpreted script.

    Our old web application was written in PHP and ran on Apache, a "pre-fork" multiprocess HTTP server. Apache works by starting a parent process which then forks several child processes to listen and wait for HTTP connections. Since each of these child processes serves one HTTP request at a time, Apache creates a pool of processes to handle connections in a timely fashion.
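    A rough sketch of the kind of shared index cache described above, written in Python for brevity. This is not the code from the quoted articles; MessageIndexCache, load_index and write_message are invented names, and a plain dict stands in for the database. Reads are answered from memory, and the cache only changes when a write goes through.

```python
import threading


class MessageIndexCache:
    """In-memory index shared by all threads; touched only on writes."""

    def __init__(self, load_index, write_message):
        self._lock = threading.Lock()
        self._load_index = load_index        # hypothetical: reads whole index
        self._write_message = write_message  # hypothetical: writes one row
        self._index = None

    def index(self):
        with self._lock:
            if self._index is None:
                self._index = dict(self._load_index())  # the only DB read
            return dict(self._index)  # hand out a copy, keep ours private

    def post(self, message_id, subject):
        with self._lock:
            self._write_message(message_id, subject)  # database first
            if self._index is not None:
                self._index[message_id] = subject     # then the shared cache


database = {1: "First post"}
db_reads = []


def load_index():
    db_reads.append(1)
    return database


def write_message(message_id, subject):
    database[message_id] = subject


forum = MessageIndexCache(load_index, write_message)
for _ in range(1000):
    forum.index()
forum.post(2, "Re: First post")
print(len(db_reads), forum.index())
```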

    The disadvantage of this approach is that it can result in a great deal of overhead due to the 1:1 ratio between processes and requests. This can be particularly true in the case of HTTP keepalives, a feature designed to speed up web serving by handling multiple sequential requests from a client on the same connection, saving the time of having to build up a new connection for each request. The disadvantage comes into play when a child process is forced to wait a given amount of time on a client before accepting a connection from a different client. If the keepalive timeout is 15 seconds, then each Apache process will be unable to handle any other connections for 15 seconds following the final request from a client.

    This means an Apache web server using keepalives will need to have more child processes running than connections. Depending upon the configuration and the amount of traffic, this can result in a process pool that is significantly larger than the total number of concurrent connections. In fact, many large sites even go so far as to disable keepalives on Apache simply because all the blocked processes consume too much memory.
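    A back-of-the-envelope version of that argument, as a Python sketch. Only the 15-second keepalive timeout comes from the text above; the arrival rate, the time spent actually serving a connection and the per-process memory figure are assumptions picked to make the arithmetic easy. The estimate is simply arrival rate multiplied by how long each connection ties up a process.

```python
import math


def children_needed(new_connections_per_sec, busy_seconds, keepalive_timeout):
    """Estimate the child processes tied up at steady state.

    Each connection holds a process while it is being served and then
    for the keepalive timeout after the client's final request.
    """
    held_seconds = busy_seconds + keepalive_timeout
    return math.ceil(new_connections_per_sec * held_seconds)


RATE = 20              # assumed: new client connections per second
BUSY = 1               # assumed: seconds spent actually serving each one
MB_PER_CHILD = 8       # assumed: resident memory per Apache child

with_keepalive = children_needed(RATE, BUSY, keepalive_timeout=15)
without_keepalive = children_needed(RATE, BUSY, keepalive_timeout=0)

print("keepalive 15s:", with_keepalive, "children,",
      with_keepalive * MB_PER_CHILD, "MB")
print("keepalive off:", without_keepalive, "children,",
      without_keepalive * MB_PER_CHILD, "MB")
```

    With these made-up figures the pool is sixteen times larger with keepalives on, nearly all of it idle processes waiting out the timeout, which is the overhead the passage describes.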

    Another issue with a multiproces

  • Your problem is that you contracted this out to a company that grew up during the dot com haze and they gave you a dot com CMS. Now they're more than happy to manage and maintain the beast.

    If you weren't using proprietary Java APIs (from BEA, ATG, Broadvision, etc.) you could get the same system running on inexpensive Intel-based machines with Linux or FreeBSD, Apache, JBOSS or JOnAS (I can't believe nobody mentioned it yet) or even straight Tomcat with a Postgres DB on the back-end. All free/open source

  • Sounds like changes are infrequent... in which case a good caching strategy is your key.

    My suggestion would be to build a proxy/caching server and specifically set the no-cache directive for dynamic content. If you have dynamic content within most pages that would take a slightly different approach; I would suggest hosting dynamic content on separate pages on the server and have the proxy server stitch the two together. A proprietary HTML extension (similar to IFRAME) and slight modification of the proxy
  • I'm no expert, and I'm strictly talking "off the top of my head" here, but I have a couple of ideas for you.

    First of all, I never understood why people who were serving up something static like news articles would want to store them in a database, fetching them dynamically with Java. They usually say "to permit searching" but there are other ways to do that (i.e. store some metadata and URLs in the database, and the actual articles in the filesystem). I think most people who write database apps do it becau
    • If you hire consultants, they'll go for maxing out their billable hours and their expense sheet. In-house guys are generally on salary and don't have much in the way of conflicts of interest (umm... IF, that is, you don't hire any fanatics).

      If you manage consultants correctly you can account for their time and this is not an issue. The salaried guys OTOH are impossible to get to do *anything*, their biggest decisions of the day are whether to complain about the coffee, the company, or the weather, especia

  • My company is a large-market US daily newspaper, and we are building CMS systems in Zope [zope.org] & Plone [plone.org] (using Python). There may be several advantages to using a scripting language, but a shift from Java to a non-OO scripting language like PHP is likely higher risk for you - Zope (and the Zope Content Management Framework) may offer a better solution given it has a toolset of components to leverage out of the box, and a simple, component-oriented way of developing content management applications with a script

  • You have already seen the problem, but do not want to acknowledge it. You are on a system that when it scales, costs lots of money.

    Several others have suggested refactoring/rearchitecting. If you have not done so in a while, you may wish to do just that. You have suggested the idea that Java may be killing you, but then you point out that you use pure Java on Solaris. Your real cost is the system with the use of a pure dynamic environment.

    If you truly have a significant amount of static or near static

  • The SPARC platform is expensive, and you can get a much better value by using x86 hardware.
  • 100KBytes per logon is ridiculous, and shows that the architecture/design of the dynamic page was poor and didn't consider scalability, resource-caching and sharing etc. Also, java with free application servers on linux x86 boxes is pretty cost effective.
  • Java really is best for enterprise level web work. Enterprise is a completely different ballgame compared to /. level stuff. On /. you lose a bunch of trolls; in an enterprise you lose real $$ every time there's a mistake! And you HAVE to fix it.

    Java seems to have the most hooks to the really cool database functions necessary for enterprise web. This isn't MySQL land anymore. You absolutely need persistence and triggers and a DB that takes care of itself. Right now that's a terrible hack in PHP t

  • Sounds like your architecture is fundamentally broken. I don't think java is the cause of the brokenness; I'm guessing that if you implemented the same architecture in PHP/ASP/PERL/COBOL it would suck.
    I have personally been in this position - I managed a development team which had inherited a large piece of web architecture (not java !) from a consulting firm just prior to launch (yeah, I should have seen it coming), and it didn't scale. At all. On our big, expensive, 4-processor sun boxen, we could support
  • I think that for Multimedia streaming and such Java is going to do the job a good deal easier than PHP.
  • solutions pulled from my ass. I don't tend to think Java is your problem, and what you are suggesting with the static content/caching/PHP is not a step forward by any means. In fact it sounds like a suggestion from your sysadmin, who is probably a PHP fanatic and does not know any better. Don't know how many times I have had to smack our sysadmin at my company for saying: 'Why does it have to be so complex? just do the whole thing in PHP!' For most people complex means any thing but [insert the only langu
  • If you're having problems like 100+K memory objects for a single user (have you got their whole family history in there?), then I suspect you're going to have a whole lot more problems when you try to do the same things in PHP.

    Something sounds awfully skewed in that application. I suspect that your site does the dotcom era idea of "MyNewsite" and stores loads of info on the specific user's layout etc. While this is fine and dandy, one does ask oneself why all this is sitting in memory all the time. It sounds
  • After reading this story, and all the comments, I've come to this conclusion:

    Good programmers can write good/fast/efficient code in ANY language. Crappy programmers write crap/slow/inefficient code in EVERY language.
  • Thanks 'newbroom'. This has been the most thoughtful discussion for some time on Slashdot.

    Please post again.
