Programming

A Fictional Compression Metric Moves Into the Real World

Posted by Unknown Lamer
from the best-thing-since-sliced-scatterplots dept.
Tekla Perry (3034735) writes "The 'Weissman Score' (created for HBO's "Silicon Valley" to add dramatic flair to the show's race to build the best compression algorithm) creates a single score by considering both the compression ratio and the compression speed. While it was created for a TV show, it does really work, and it's quickly migrating into academia. Computer science and engineering students will begin to encounter the Weissman Score in the classroom this fall."

  • Bullshit.... (Score:5, Interesting)

    by gweihir (88907) on Monday July 28, 2014 @04:42PM (#47552863)

    A "combined score" for speed and ratio is useless, as that relation is not linear.

    • Re:Bullshit.... (Score:4, Insightful)

      by i kan reed (749298) on Monday July 28, 2014 @04:54PM (#47552941) Homepage Journal

      Well then write a paper called "an improved single metric for video compression" and submit it to a compsci journal. Anyone can dump opinions on slashdot comments, but if you're right, then you can get it in writing that you're right.

      • Re:Bullshit.... (Score:5, Insightful)

        by gweihir (88907) on Monday July 28, 2014 @04:59PM (#47552985)

        There is no possibility for a useful single metric. The question obviously does not apply to the problem. Unfortunately, most journals do not accept negative results, which is one of the reasons for the sad state of affairs in CS. For those that do, the reviewers would very likely call this one "trivially obvious", which it is.

        • This point comes up often in genetic algorithms, when more than one quantity should be optimized for. A common solution is to build a Pareto frontier [wikipedia.org], and declare the points on it the best.

          A combination between two quantities is always a personal weighting. It may be useful, but it may also be limited in application. In the case here, the balance between compression speed and achieved size is too personal to be general-purpose, but perhaps the metric is useful for the use case of TV streaming content providers.
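
          A minimal sketch of the Pareto-frontier idea applied to compressors, assuming each candidate is summarized by one compression ratio and one compression time. The names and numbers are made up for illustration; nothing here comes from the article.

```python
# Hypothetical measurements: (name, compression ratio, seconds to compress).
# Higher ratio is better; lower time is better. The numbers are illustrative only.
candidates = [
    ("fast", 1.5, 2.0),
    ("balanced", 2.4, 9.0),
    ("slow", 2.5, 600.0),
    ("dominated", 1.4, 12.0),
]


def dominates(a, b):
    """True if a is at least as good as b on both axes and strictly better on one."""
    _, ratio_a, time_a = a
    _, ratio_b, time_b = b
    no_worse = ratio_a >= ratio_b and time_a <= time_b
    strictly_better = ratio_a > ratio_b or time_a < time_b
    return no_worse and strictly_better


def pareto_frontier(items):
    """Keep every item that no other item dominates."""
    return [x for x in items if not any(dominates(y, x) for y in items)]


frontier_names = [name for name, _, _ in pareto_frontier(candidates)]
print(frontier_names)
```

          Only the candidate that is both slower and weaker than another one drops out. The frontier does not rank the survivors against each other; that step still needs a weighting chosen by whoever is using it.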

      • by Darinbob (1142669)

        I don't think this metric is really in any computer science journal, it's only in IEEE Spectrum.

      • Uhm, do you really think that something as important as assessing the performance of compression algorithms wouldn't have attracted the attention of thousands (or, more likely, hundreds of thousands) of computer scientists over the years? Open up any academic journal that deals with this stuff even tangentially and you find many examples of different metrics for assessing compression performance. And there's nothing new about this 'score'. Dividing ratio by the logarithm of the compression time is a very wi

    • Re:Bullshit.... (Score:5, Insightful)

      by nine-times (778537) <nine.times@gmail.com> on Monday July 28, 2014 @05:20PM (#47553137) Homepage

      Can you explain in more detail?

      I'm not an expert here, but I think the idea is to come up with a single quantifying number that represents the idea that very fast compression has limited utility if it doesn't save much space, and very high compression has limited utility if it takes an extremely long time.

      Like, if you're trying to compress a given file, and one algorithm compressed the file by 0.00001% in 14 seconds, another compressed the file 15% in 20 seconds, and the third compressed it 15.1% in 29 hours, then the middle algorithm is probably going to be the most useful one. So why can't you create some kind of rating system to give you at least a vague quantifiable score of that concept? I understand that it might not be perfect-- different algorithms might score differently on different sized files, different types of files, etc. But then again, computer benchmarks generally don't give you a perfect assessment of performance. It just provides a method for estimating performance.

      But maybe you have something in mind that I'm not seeing.

      • by jsepeta (412566)

        That's kind of like the Microsoft Windows Experience Index that is provided by Windows Vista / Windows 7 which gives a score based on CPU, RAM, GPU, and hard disk speed. Not entirely useful but gives beta-level nerds something to talk about at the water cooler.
        http://windows.microsoft.com/e... [microsoft.com]

        At work my desktop computer is a Pentium E6300 with a 6.3 rating on the CPU and an overall 4.8 rating due to the crappy graphics chipset.
        At work my laptop computer is an i3-2010M with a 6.4 rating on the CPU and an ove

      • Re:Bullshit.... (Score:5, Informative)

        by mrchaotica (681592) * on Monday July 28, 2014 @07:04PM (#47553795)

        Can you explain in more detail?

        If you have a multi-dimensional set of factors of things and you design a metric to collapse them down into a single dimension, what you're really measuring is a combination of the values of the factors and your weighting of them. Since the "correct" weighting is a matter of opinion and everybody's use-case is different, a single-dimension metric isn't very useful.

        This goes for any situation where you're picking the "best" among a set of choices, not just for compression algorithms, by the way.

        Like, if you're trying to compress a given file, and one algorithm compressed the file by 0.00001% in 14 seconds, another compressed the file 15% in 20 seconds, and the third compressed it 15.1% in 29 hours, then the middle algorithm is probably going to be the most useful one.

        User A is trying to stream stuff that has to have latency less than 15 seconds, so for him the first algorithm is the best. User B is trying to shove the entire contents of Wikipedia into a disc to send on a space probe [wikipedia.org], so for him, the third algorithm is the best.

        You gave a really extreme[ly contrived] example, so in that case you might be able to say that "reasonable" use cases would prefer the middle algorithm. But differences between actual algorithms would not be nearly so extreme.
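
        A rough sketch of that point, using the three algorithms from the quoted example. The selection rule (largest saving that fits a hard time budget) and the third "typical" user are assumptions added for illustration.

```python
# The three algorithms from the example above: (name, percent saved, seconds).
algorithms = [
    ("first", 0.00001, 14.0),
    ("middle", 15.0, 20.0),
    ("third", 15.1, 29 * 3600.0),
]


def best_within_budget(algos, max_seconds):
    """Pick the largest saving among algorithms that finish in under max_seconds."""
    feasible = [a for a in algos if a[2] < max_seconds]
    return max(feasible, key=lambda a: a[1])[0] if feasible else None


user_a = best_within_budget(algorithms, max_seconds=15.0)          # streaming, hard deadline
user_b = best_within_budget(algorithms, max_seconds=float("inf"))  # space probe, no deadline
user_c = best_within_budget(algorithms, max_seconds=60.0)          # hypothetical "typical" user
print(user_a, user_b, user_c)
```

        Each user gets a different "best" algorithm from the same three measurements, because the constraint comes from the use case rather than from the algorithms.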

        • Re:Bullshit.... (Score:4, Insightful)

          by nine-times (778537) <nine.times@gmail.com> on Monday July 28, 2014 @07:27PM (#47553917) Homepage

          Since the "correct" weighting is a matter of opinion and everybody's use-case is different, a single-dimension metric isn't very useful...[snip] User A is trying to stream stuff that has to have latency less than 15 seconds, so for him the first algorithm is the best.

          And these are very good arguments why such a metric should not be taken as an end-all be-all. Isn't that generally the case with metrics and benchmarks?

          For example, you might use a benchmark to gauge the relative performance between two video cards. I test Card A and it gets 700. I test Card B and it gets a 680. However, in running a specific game that I like, Card B gets slightly faster framerates. Meanwhile, some other guy wants to use the video cards to mine Bitcoin, and maybe these specific benchmarks test entirely the wrong thing, and Card C, which scores 300 on the benchmark, is the best choice. Is the benchmark therefore useless?

          No, not necessarily. If the benchmark is supposed to test general game performance, and generally faster benchmark tests correlate with faster game performance, then it helps shoppers figure out what to buy. If you want to shop based on a specific game or a specific use, then you use a different benchmark.

          • by Ardyvee (2447206)

            Why generate a score in the first place, when you can just provide compression ratio, compression speed, or in the case of the card: fps (at settings), energy used, consistency of the fps (at settings), along with any other characteristic you know or can test that doesn't combine two other things and let the user decide which of those things are better instead of trying to boil it all down to a single number?

            • by gweihir (88907)

              The uses for that single number are as follows:

              a) Some class of people like to claim "mine is bigger", which requires a single number. While that is stupid, most people "understand" this type of reasoning.
              b) Anything beyond a single number is far too complicated for the average person watching TV.

              In reality, things are even more complicated, as speed and compression ratio depend both on the data being compressed, and do that independently to some degree. This means, some data may compress really well and do

            • Depending on what you're talking about, providing a huge table of every possible test doesn't make for easy comparisons. In the case of graphics cards, I suppose you could provide a list of every single game, including framerates on every single setting on every single game. It would be hard to gather all that data, and the result would be information overload, and it still wouldn't allow you to make a good comparison between cards. Even assuming you had such a table, it would probably be more helpful to

      • by gweihir (88907)

        It depends far too much on your boundary conditions. For example, LZO does not compress very well, but it is fast and has only a 64kB footprint. Hence it gets used in space probes, where the choice is to compress with this or throw the data away. On the other hand, if you distribute pre-compressed software or data to multiple targets, even the difference between 15.0% and 15.1% can matter, if it is, say, 15.0% in 20 seconds and 15.1% in 10 minutes.

        Hence a single score is completely unsuitable to address the "qua

        • Hence a single score is completely unsuitable to address the "quality" of the algorithm, because there is no single benchmark scenario.

          So you're saying that no benchmark is meaningful because no single benchmark can be relied upon to be the final word under all circumstances? By that logic, measuring speed is not meaningful, because it's not the final word in all circumstances. Measuring the compression ratio is meaningless because it's not the final word in all circumstances. The footprint of the code is meaningless because it's not the final word in all circumstances.

          Isn't it possible that a benchmark could be useful for some purpose

          • by gweihir (88907)

            Whether measuring speed is a meaningful benchmark depends on what you measure the speed of, relative to what, and what the circumstances are. There are many situations where "speed" is not meaningful, and others that are limited enough that it is.

            However, the metric under discussion will not be meaningful in any but the most bizarre and specific circumstances, hence it is generally useless. For the special situations where it could be useful, it is much saner to adapt another metric than define a specific

            • I find it surprising and almost funny how much ire this has drawn from people with some kind of weird "purist" attitude about the whole thing.

              It doesn't seem "generally useless" to me, but it would be more appropriate to say that it's "useful only in general cases". I would say that in most circumstances, I'd want compression algorithms that balance speed and compression. I often don't zip my files to maximum compression, for example, because I don't want to sit around waiting for a long time in order to

              • by gweihir (88907)

                The ire is because quite a few people cannot distinguish fake TV science and engineering from the real thing anymore. This "metric" is a high-quality fake and completely useless.

                • Well no, the metric is real. The question would be whether it's useful or meaningful. You originally implied that it wasn't because:

                  A "combined score" for speed and ratio is useless, as that relation is not linear.

                  It seems now that it's not about the relation being linear, but about something else that you won't say. I'm afraid I'm not closer to understanding.

                  • by gweihir (88907)

                    You really do not get it, I agree. This metric is useless. It follows the definition of a metric, true, but it has no reasonable practical use, hence it does not deserve any special distinction, like being given a name. That is what is fake here.

                    • Ok, I was giving you the benefit of the doubt, but it seems your argument boils down to "It's useless because I say it's useless. Never mind that you earlier pointed out that it could be useful, because I decided that it's useless."

                      Glad we got that sorted out.

                    • I suspect another variable may be at play here: "Ambiguity Tolerance".

                      This Weissman Score may provide a great test to determine where someone's tolerance for ambiguity is, based on how useful or useless they think a metric like this might be. But then, if the WS becomes useful for determining AT, it will become less useful for determining AT because its perceived usefulness will have increased, which will then make it more useful... I think I feel my AT falling.
      • by sootman (158191)

        I'd just say it's useless because no two people can agree on what's important, so what's the point of giving a single score? And even something as seemingly simple as a compression algorithm has more than just two characteristics:
        1) speed of compression
        2) file size
        3) speed of decompression
        4) does it handle corrupt files well? (or at all?)

        Even just looking at 1 & 2, everyone has different needs. Some people value 1 above all others, some people value 2, and most people are somewhere in between, and "some

        • there's not a meaningful way to pick the "best" in that group that everyone will agree on

          Metrics often don't provide a definitive answer about what the best thing is, with universal agreement. If I tell you Apple scores highest in customer satisfaction for smartphones last year, does that mean everyone will agree that the iPhone is the best phone? If a bunch of people are working at a helpdesk, and one closes the most tickets per hour, does that necessarily mean that he's the best helpdesk tech?

          It's true that a lot of people misuse metrics, thinking that they always provide an easy answer, w

      • It depends on the situation where it is used. If your data almost but not quite fits on your available media at 15%, and you're not pressed for time, you might still go for 15.1%. And if you only have 15 seconds to compress it, strictly no more, you might settle for significantly less compression than would be possible in 20 seconds.
      • by loufoque (1400831)

        very high compression has limited utility if it takes an extremely long time

        I don't see how the utility is limited.
        Most content is mastered once and viewed millions of times.

        How much time it takes to compress is irrelevant, even if you get diminishing returns the longer you take. What's important is to save space when broadcasting the content.

        • How much time it takes to compress is irrelevant, even if you get diminishing returns the longer you take. What's important is to save space when broadcasting the content.

          Well, and also that it can be decompressed quickly and with little processing power, or else with enough hardware support that it doesn't matter. Otherwise, it'd take a long time to access and drain power on mobile devices.

          • Compression and decompression are different things.

            • Ok, so let's start from where you're wrong that "What's important is to save space when broadcasting the content." There are other important things.

              Next, what would you like to do then? Change this benchmark to measure decompression speed rather than compression speed? Sure, fine. Let's do that.

              • by loufoque (1400831)

                Decompression time is always real time. That's obvious.
                Compression is a whole different beast. Some applications need real-time encoding (such as video-conferencing), but most do not.
                Have you even ever written an encoder?

                • Decompression time is always real time? So it doesn't matter what computer, what processor, the size of the file, the complexity of the file, or even what kind of file it is? Or do you mean that it needs to be able to be done in real time (or faster) for some particular use of a particular kind of file on a particular platform that you have in mind?

      • Can you explain in more detail?

        I'm not an expert here, but I think the idea is to come up with a single quantifying number that represents the idea that very fast compression has limited utility if it doesn't save much space, and very high compression has limited utility if it takes an extremely long time.

        Like, if you're trying to compress a given file, and one algorithm compressed the file by 0.00001% in 14 seconds, another compressed the file 15% in 20 seconds, and the third compressed it 15.1% in 29 hours, then the middle algorithm is probably going to be the most useful one. So why can't you create some kind of rating system to give you at least a vague quantifiable score of that concept? I understand that it might not be perfect-- different algorithms might score differently on different sized files, different types of files, etc. But then again, computer benchmarks generally don't give you a perfect assessment of performance. It just provides a method for estimating performance.

        But maybe you have something in mind that I'm not seeing.

        A compsci sacred cow being slaughtered. See, there is nothing wrong with what you suggested. That's the reason why the idea was inserted into Silicon Valley to begin with. So why the bitching about its usefulness? People who spend time in computing as a whole are a fairly rigid lot. A lot of them have Asperger's syndrome, which gives them a leg up on coding while taking away their socialization skills. Others think it's useless because they would prefer terms that dig deeper into the compression and its velocity.

    • by ultranova (717540)

      A "combined score" for speed and ratio is useless, as that relation is not linear.

      A combined score could be quite useful when implementing, for example, compressed swap. Obviously you'd need to calibrate it for the specifics of a case.

      • by gweihir (88907)

        When you "calibrate" swap for specific uses, it becomes non-general. In that situation it is far better to let the application use on-disk storage, because _it_ knows the data profile. Sorry, but you fail to understand swap.

        • by ultranova (717540)

          When you "calibrate" swap for specific uses, it becomes non-general.

          Metric, not swap. I'm talking about compressing memory pages before swapping out, possibly to another memory region, and calibrating the metric to balance between CPU cycles used vs. disk traffic saved, possibly dynamically.

          In that situation it is far better to let the application use on-disk storage, because _it_ knows the data profile.

          And the OS knows the general state of the system. Also, virtual memory systems are far from trivial to

          • by gweihir (88907)

            Really, you do not understand what makes swap slow or fast. Go play somewhere else.

    • by sg_oneill (159032)

      A "combined score" for speed and ratio is useless, as that relation is not linear.

      Typing at 70 words per minute, slashdot poster declares quantity over time measurements meaningless.

    • by hey! (33014)

      It doesn't have to be linear to be useful. It simply has to be able to sort a set of choices into order -- like movie reviews. Nobody thinks a four star movie is "twice as good" as a two star movie, but people generally find the rank ordering of movies by stars useful provided they don't read too much into the rating. In fact the ordering needn't be unique; there can be other equally useful metrics which order the choices in a slightly different way. *Over certain domains of values* minor differences in or

  • by Anonymous Coward

    I thought I read an article the other day that said their algorithm seemed plausible on the surface but would eventually begin to fall apart?

    • The fictional compression algorithm doesn't work. The metric for rating compression algorithms does work (insofar as more compressed/faster algorithms achieve a better rating).
      • When talking about lossy compression for video it might technically work but it's still worthless. For example, my highly proprietary, heavily patented postage stamp algorithm reduces all video down to '90s-era dialup-rate MPEG-2, aka a blurry postage stamp. This means it's massively compressed and very quick, so it scores high on both metrics. It also looks like crap. Output quality and ratio are generally the metrics that matter and output quality is a subjective factor that needs to be determined by humans

        • by khellendros1984 (792761) on Monday July 28, 2014 @07:12PM (#47553849) Journal
          FTA:

          And Jerry Gibson, a professor at the University of California at Santa Barbara, says he's going to introduce the metric into two classes this year. For a winter quarter class on information theory, he will ask students to use the score to evaluate lossless compression algorithms. In a spring quarter class on multimedia compression, he will use the score in a similar way, but in this case, because the Weissman Score doesn't consider distortion introduced in lossy compression, he will expect the students to weight that factor as well.

          The scoring method as stated is only useful for evaluating lossless compression. One could also take into account the resemblance of the output to the input to allow a modified version of the score to evaluate lossy compression.

  • by retchdog (1319261) on Monday July 28, 2014 @04:46PM (#47552879) Journal

    The so-called Weissman score is just proportional to (compression ratio)/log(time to compress).

    I guess the idea is that twice as much compression is always twice as good, while increases in time become less significant if you're already taking a long time. For example, taking a day to compress is much worse than taking an hour, but taking 24 days to compress is only somewhat worse than taking one day since you're talking offline/parallel processing anyway.

    The log() seems kind of an arbitrary choice, but whatever. It's no better or worse than any other made-up metric, as long as you're not taking it too seriously.
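
    A minimal sketch of the simplified form described above, assuming the natural log and time measured in seconds (the comment does not specify either). It only illustrates the diminishing penalty for long compression times; it is not the full published score.

```python
import math


def simplified_score(ratio, seconds):
    """Compression ratio divided by the natural log of the compression time.

    Assumes seconds > 1; at exactly 1 second the denominator is zero, and
    below 1 second it is negative.
    """
    return ratio / math.log(seconds)


HOUR = 3600.0
DAY = 24 * HOUR

# Same ratio, increasingly long compression times.
one_hour = simplified_score(2.0, HOUR)
one_day = simplified_score(2.0, DAY)
twenty_four_days = simplified_score(2.0, 24 * DAY)

# Going from an hour to a day costs more score than going from a day to 24 days.
print(one_hour - one_day, one_day - twenty_four_days)
```

    Each step multiplies the time by 24, yet the second step costs less score than the first, while doubling the ratio at a fixed time always doubles the score.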

    • The formula is not too bad, although I would suggest a minor tweak, mainly that one should change it from:

      (compression ratio)/log(time to compress)

      to:

      (compression ratio)/log(10+time to compress).

      This will ensure that no divide by zero occurs, specifically if the time to compress is 1 second, then you would have been dividing by zero in the original formula.
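
      A small sketch of the difference, assuming the natural log and seconds as the time unit (neither is stated above). The function names are made up for this example.

```python
import math


def original(ratio, seconds):
    return ratio / math.log(seconds)


def tweaked(ratio, seconds):
    # The +10 offset keeps the argument of the log at 10 or more for any
    # non-negative time, so the denominator stays positive.
    return ratio / math.log(10 + seconds)


try:
    original(2.0, 1.0)
    failed = False
except ZeroDivisionError:
    failed = True

print(failed, tweaked(2.0, 1.0), tweaked(2.0, 0.001))
```

      The original form fails at exactly one second, and for sub-second times its denominator goes negative. The tweaked form stays positive and still rewards the faster run.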

    • (compression ratio)/log(time)

      I guess the idea is that twice as much compression is always twice as good, while increases in time become less significant if you're already taking a long time.

      Yeah, I guess I empirically decided this for myself way back with DOS PKZip v0.92: either FAST because I want it now, or MAXIMIZE because I'm somehow space limited and don't care how long it takes. The intermediate ones (and for WinZip, WinRAR, 7z, and the others) are useless for me; either SIZE or SPEED, there IS nothing else.

      (Unless you can somehow delete or omit it; nothing's faster than not doing it to start with.)

      And look -- they're using logs! Now when someone on the show talks about some cu

  • From the article:

    Misra came up with a formula

  • Not only does it fail to account for loss or distortion, but it also fails to consider the time to decompress. If a compression algorithm with a high Weissman score is applied to a video, it is useless if it cannot be decompressed fast enough to show the video at an appropriate frame rate.
    • No metric is adequate for all purposes. This one is adequate for the task it was designed for, and is adequate for some other purposes as well. That's the best that can be expected of any tool. Always use the appropriate tools for the task at hand, of course.
      • by retchdog (1319261)

        It was designed as a background prop for a TV show. Not a very high bar.

        It might be adequate as an artificial evaluation metric for homework in an "Intro to Data Compression" class. It might be, because it hasn't even been used for that yet.

        I wouldn't exactly call this a tool. For example, it would be really easy to game this 'score' if there were any significant incentive for doing so. That's usually a bad thing.

      • by fnj (64210) on Monday July 28, 2014 @06:01PM (#47553429)

        The reason the Score is utter bullshit is that the scale is completely arbitrary and useless. It says that 2:1 compression that takes 1 second should have the same score as 4:1 compression that takes log(2) seconds, or 1 million to 1 compression that takes log(1 million) seconds.

        WHY? State why log time is a better measure than straight time, or time squared, or square root of time. And look at the units of the ratio: reciprocal log seconds. What the hell is the significance of that? It also conveniently sidesteps the variability with different architectures. Maybe SSE helps algorithm A much more than it does algorithm B. Or B outperforms A on AMD, but not on Intel. Or maybe it is strongly dependent on size of source (there is an implicit assumption that all algorithms scale linearly with size of source; maybe in actual fact some are not linear and others are).

        In real life, for some compression jobs you don't CARE how long it takes, and for other jobs you care very much. Or imagine an algorithm that compresses half as fast but decompresses 1000 times faster. That doesn't even register in the score.

        It's bullshit.
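
        One way to see how much the choice of time transform matters is to rank the same two measurements under each candidate. This is a sketch with made-up numbers for two hypothetical compressors A and B, assuming natural log and seconds.

```python
import math

# Hypothetical measurements: name -> (compression ratio, seconds to compress).
runs = {"A": (2.0, 100.0), "B": (2.7, 400.0)}

# Three candidate penalties for time; each yields a "ratio / penalty" score.
penalties = {
    "time": lambda t: t,
    "sqrt(time)": math.sqrt,
    "log(time)": math.log,
}

winners = {}
for label, penalty in penalties.items():
    scores = {name: ratio / penalty(seconds) for name, (ratio, seconds) in runs.items()}
    winners[label] = max(scores, key=scores.get)

print(winners)
```

        With these numbers, straight time and square root of time both prefer A, while log time prefers B, so the transform is itself a weighting decision.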

        • by Obfuscant (592200)

          And look at the units of the ratio: reciprocal log seconds.

          The Weissman score is actually unitless. When one divides "log seconds" by "log seconds" the units cancel.

          It also conveniently sidesteps the variability with different architectures.

          If one measures the compression ratios and times for the same data on different architectures, one is measuring the score of the different architecture, not "sidestepping" it.

          Maybe SSE helps algorithm A much more than it does algorithm B.

          Then algorithm A compared to B would have a higher Weissman score on a system with SSE.

          Or B outperforms A on AMD, but not on Intel.

          Then the score would favor B over A when comparing the two processors. That's what the score is supposed to do. It compares two things.

          In real life, for some compression jobs you don't CARE how long it takes, and for other jobs you care very much.

          Th

          • by Lehk228 (705449)
            Decompression speed is unimportant for general-purpose compression: it is either adequate or not adequate. If decompression speed is not adequate, it does not matter how well the algorithm scores on other metrics; it is unusable for your use case. If decompression speed is adequate, it really does not matter whether it's just barely adequate or insanely fast.
          • by fnj (64210)

            The Weissman score is actually unitless. When one divides "log seconds" by "log seconds" the units cancel.

            That is because it is presented as the ratio of the figure of merit of the candidate algorithm to the figure of merit of some bullshit "universal compresser", times a completely useless "scaling constant". To strip away the obscuration, all you have to do is see that for a completely transparent effectless compresser, r is unity and log t is log 0, or unity. 1/1, and it drops out.

            The underlying figure o

            • by swillden (191260)

              some bullshit "universal compresser"

              Not a universal compressor, a standard compressor, such as gzip. The metric is ultimately just a comparison between the compressor being evaluated and the compressor chosen as the standard, and it is unitless.

              That said, I agree with you that the scaling constant has no reason to be present. As for using the logs of times... I don't know. It's essentially a base change, expressing the time of the compressor being evaluated in the base of the standard compressor, which is then multiplied by the ratio of th

            • by Obfuscant (592200)

              The underlying figure of merit once you cut through the bullshit is r / log t. r is the compression ratio (unitless) and log t is log seconds. So yes, the units of the underlying figure of merit are reciprocal log seconds.

              The fact that the actual equation is a ratio between a proposed compression implementation and a reference is a hint that it is not a "figure of merit" in absolute terms, but only with respect to some common standard. Yeah, you get to pick your standard, but simply reporting r/log(t) is meaningless. The actual measurement is unitless simply because, as you point out, units of 1/log(s) is meaningless.

              It's done that way so things can be repeatable. If I create a compressor and report a Weissman of 3, then y

            • You should never take the logarithm of a dimensionful quantity like seconds. Clearly some choice of units is implied and really we should have log(t/1s) or log(t/1ms) or something which would then make the score unitless.

              You need to learn to cut through the hocus pocus and analyze the actual underlying equation before the Oz Sauce is ladled on. You can well imagine that those who actually understand programming metrics are holding their sides laughing at those who are taking it seriously.

              and you need to go take some remedial math lessons if you think log(0) = 1.
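
              For reference, the full score as it is usually written is a ratio against a standard compressor, which is why it comes out unitless. In the sketch below, r and T are the ratio and time of the compressor under test, the barred symbols are those of the standard compressor, alpha is the scaling constant, and tau is an explicit time unit added here as an assumption to make the unit choice visible.

```latex
W = \alpha \, \frac{r}{\bar{r}} \,
    \frac{\log\!\left(\bar{T}/\tau\right)}{\log\!\left(T/\tau\right)},
\qquad
\log\frac{T}{\tau'} = \log\frac{T}{\tau} + \log\frac{\tau}{\tau'}
```

              The second identity shows that switching the unit from tau to tau' adds the same constant to the numerator and the denominator. The score stays unitless, but its value changes with the unit, so the unit has to be fixed before two scores can be compared.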

        • by TubeSteak (669689)

          Maybe SSE helps algorithm A much more than it does algorithm B. Or B outperforms A on AMD, but not on Intel. Or maybe it is strongly dependent on size of source (there is an implicit assumption that all algorithms scale linearly with size of source; maybe in actual fact some are not linear and others are).

          In real life, for some compression jobs you don't CARE how long it takes, and for other jobs you care very much. Or imagine an algorithm that compresses half as fast but decompresses 1000 times faster. That doesn't even register in the score.

          The things you mention have always been left as an exercise for the reader.
          What benchmark isn't tagged with qualifiers that explain what it does and doesn't mean?

          Marketing literature in computing has always been littered with metrics that are completely useless unless you know how to interpret them in the context of what you want to be doing.

  • by JoeyRox (2711699) on Monday July 28, 2014 @05:00PM (#47552989)
    Two scores would be useful, one for compression_time:size and one for decompression_time:size, since the latter is more important in compress-once, consume-many applications.
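
    A minimal sketch of collecting both measurements, using Python's standard zlib module as a stand-in compressor. The `measure` helper and the sample input are assumptions for illustration, and the timings will vary by machine.

```python
import time
import zlib


def measure(data, level):
    """Return (ratio, compress_seconds, decompress_seconds) for zlib at one level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_seconds = time.perf_counter() - start

    assert unpacked == data  # lossless round trip
    return len(data) / len(packed), compress_seconds, decompress_seconds


# Illustrative input only: repetitive text compresses far better than typical data.
sample = b"the quick brown fox jumps over the lazy dog\n" * 20000

for level in (1, 6, 9):
    ratio, c_time, d_time = measure(sample, level)
    print(f"level {level}: ratio {ratio:.1f}, compress {c_time:.4f}s, decompress {d_time:.4f}s")
```

    Reporting the three numbers separately leaves the weighting to the reader, which suits the compress-once, consume-many case where decompression time dominates.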
  • IIRC, the Drake equation was also a 'spitball' solution whipped off the cuff to address an inconvenient interviewer question. Subsequent tweaks have made it as accurate and reliable as when it was first spat out upon the world - and about as useless.
    • by Rockoon (1252108)

      IIRC, the Drake equation was also a 'spitball' solution whipped off the cuff to address an inconvenient interviewer question. Subsequent tweaks have made it as accurate and reliable as when it was first spat out upon the world - and about as useless.

      At least the Drake equation attempts to count something. I think people are missing this important fact about this bullshit compression rating: It isn't counting anything.

  • circle jerk (Score:2, Funny)

    by Anonymous Coward

    Show About Self-Absorbed Assholes Who Think Their Stupid Ideas Are The Bees Knees Gains Popularity By Making Their Stupid Idea Sound Like Its The Bees Knees

    • by Anonymous Coward
      Or simply SASAAWTTSIATBKGPBMTSISLITBK for short. What, are you some kind of pompous jerk who tries to sound smart saying it in full when all of us know it by the acronym?
  • Sounds a bit like the F1 measure used in classification systems, where the F-score is the harmonic mean of precision and recall (where trying for higher precision yields lower recall and vice versa).
    However, I'm wondering how stable this Weissman score is. Compression algorithms might not all perform O(n) where n is size of data to compress.
    Or it may actually give a very high score to something that doesn't compress at all.
    public byte[] compress(byte[] input) { return input; }
    I bet this gets a high Weis
  • Oh boy. A useless metric!

    Compression ratio: Sure. But the problem is, it's possible to increase compression ratio by "losing" data. So you can obtain a high ratio, but the images as rendered will be blurry/damaged.

    Compression Speed: This is just as dumb since compression speed is partially a function of the compression ratio, partially a function of the efficiency of the algorithm and partially a function of the amount of "grunt power" hardware you throw at it. So one portion of this is a nebulous "hard

    • by Chas (5144)

      Actually replaced with a better example.

      Took an 8.1MB TGA file and did three things.

      1: Saved the first off as a PNG file. Resulted in a 1.7MB file with lossless compression.
      2: Saved the file off as a high-compression JPEG. Resulted in a 46K file that's noticeably blurry and indistinct.
      3: Downsampled to 19x11 and back up to 1920x1080 and saved as a high compression JPEG (36K file) or a lossless compression PNG (114K file). Labelled this method UCCT (Ultra Crappy Compression Technique).

      Amalgamated the thre

      • This would be correct if the score wasn't being used for lossless compression where the only two variables that really matter are time/size.
  • Given that only a subset of Slashdot users are HBO subscribers, how is this relevant?

  • I couldn't watch the first episode. Quit maybe 10 minutes into it. Does anyone here actually enjoy the show and think it's any good?

    • "I couldn't watch the first episode. Quit maybe 10 minutes into it. Does anyone here actually enjoy the show and think it's any good?"

      I stayed with it and watched a number of episodes, I thought it caught the techie zeitgeist brilliantly. There's even a semi-aspie tech tycoon in there, just like you-know-who.
