
A Fictional Compression Metric Moves Into the Real World

Posted by Unknown Lamer
from the best-thing-since-sliced-scatterplots dept.
Tekla Perry (3034735) writes "The 'Weissman Score', created for HBO's 'Silicon Valley' to add dramatic flair to the show's race to build the best compression algorithm, creates a single score by considering both the compression ratio and the compression speed. While it was created for a TV show, it really does work, and it's quickly migrating into academia. Computer science and engineering students will begin to encounter the Weissman Score in the classroom this fall."

  • Re:It really works? (Score:5, Informative)

    by phoenix_rizzen (256998) on Monday July 28, 2014 @04:00PM (#47552991)

    They're talking about the Score, not the compression algorithm. And your link doesn't mention anything about the Score.

  • by retchdog (1319261) on Monday July 28, 2014 @04:09PM (#47553055) Journal

    it's for lossless compression only.

    anyway, you can just add a term representing the lost information and throw it into this "score". hey, why not? just figure out how important the lossiness is relative to compression rate. if it's very important, take the exp() of the loss metric; if it's unimportant (like time is), take the log(); finally, if it's just kind of important, leave it linear, or maybe square or square root. whatever.

    seriously, just make some shit up and throw it in. you won't compromise anything. it's already just made-up shit.
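
The suggestion above can be sketched in a few lines. This is only an illustration under stated assumptions: the base formula is the commonly published form of the score, W = alpha * (r / r_ref) * (log T_ref / log T), and the distortion penalty, the `lossy_score` name, and the three weighting modes are hypothetical choices made up here to mirror the exp/linear/log idea. None of it comes from the article.

```python
import math


def weissman_score(ratio, seconds, ref_ratio, ref_seconds, alpha=1.0):
    """Commonly published form: alpha * (r / r_ref) * (log T_ref / log T).

    ratio / seconds describe the algorithm under test; ref_ratio /
    ref_seconds describe a reference compressor run on the same data.
    Both times must exceed 1 in the chosen unit so the logs stay positive.
    """
    if seconds <= 1 or ref_seconds <= 1:
        raise ValueError("times must be > 1 in the chosen unit")
    return alpha * (ratio / ref_ratio) * (math.log(ref_seconds) / math.log(seconds))


def lossy_score(ratio, seconds, ref_ratio, ref_seconds, distortion, weight="linear"):
    """Hypothetical extension: divide the base score by a distortion penalty.

    distortion is any non-negative loss metric (0 means lossless).
    weight chooses how harshly loss is punished: "exp", "linear" or "log".
    """
    if distortion < 0:
        raise ValueError("distortion must be non-negative")
    penalties = {
        "exp": math.exp(distortion),           # loss matters a lot
        "linear": 1.0 + distortion,            # loss matters somewhat
        "log": 1.0 + math.log1p(distortion),   # loss barely matters
    }
    base = weissman_score(ratio, seconds, ref_ratio, ref_seconds)
    return base / penalties[weight]
```

Every penalty equals 1 at zero distortion, so the lossless case reduces to the base score. Which mode is "right" is exactly the arbitrary part.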

  • Re:Bullshit.... (Score:5, Informative)

    by mrchaotica (681592) * on Monday July 28, 2014 @06:04PM (#47553795)

    Can you explain in more detail?

    If you have several factors and you design a metric that collapses them into a single dimension, what you're really measuring is a combination of the values of the factors and your weighting of them. Since the "correct" weighting is a matter of opinion and everybody's use case is different, a single-dimension metric isn't very useful.

    This goes for any situation where you're picking the "best" among a set of choices, not just for compression algorithms, by the way.

    Like, if you're trying to compress a given file, and one algorithm compressed the file by 0.00001% in 14 seconds, another compressed the file 15% in 20 seconds, and the third compressed it 15.1% in 29 hours, then the middle algorithm is probably going to be the most useful one.

    User A is trying to stream stuff that has to have latency less than 15 seconds, so for him the first algorithm is the best. User B is trying to shove the entire contents of Wikipedia onto a disc to send on a space probe [wikipedia.org], so for him, the third algorithm is the best.

    You gave a really extreme[ly contrived] example, so in that case you might be able to say that "reasonable" use cases would prefer the middle algorithm. But differences between actual algorithms would not be nearly so extreme.
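
To make the weighting argument concrete, here is a minimal sketch using the three results from the example above. The `Result` type, the `best` function, and the specific weights are assumptions invented for illustration; the only numbers taken from the thread are the compression percentages, the times, and the 15-second latency limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    name: str
    saved_pct: float  # percent by which the file shrank
    seconds: float    # wall-clock compression time


RESULTS = [
    Result("first", 0.00001, 14.0),
    Result("middle", 15.0, 20.0),
    Result("third", 15.1, 29 * 3600.0),
]


def best(results, size_weight, time_weight, max_seconds=None):
    """Return the result with the highest weighted score.

    score = size_weight * saved_pct - time_weight * seconds
    Results slower than max_seconds (if given) are excluded first.
    """
    eligible = [r for r in results
                if max_seconds is None or r.seconds <= max_seconds]
    if not eligible:
        raise ValueError("no result meets the time limit")
    return max(eligible,
               key=lambda r: size_weight * r.saved_pct - time_weight * r.seconds)


# User A: hard latency limit, so only the fastest run qualifies.
user_a = best(RESULTS, size_weight=1.0, time_weight=0.0, max_seconds=15.0)
# User B: time is irrelevant, only size counts.
user_b = best(RESULTS, size_weight=1.0, time_weight=0.0)
# A "reasonable" user who cares a little about time.
typical = best(RESULTS, size_weight=1.0, time_weight=0.01)
```

The same three measurements yield three different winners (first, third, middle), and the only thing that changed was the weighting.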

  • by khellendros1984 (792761) on Monday July 28, 2014 @06:12PM (#47553849) Journal
    FTA:

    And Jerry Gibson, a professor at the University of California at Santa Barbara, says he's going to introduce the metric into two classes this year. For a winter quarter class on information theory, he will ask students to use the score to evaluate lossless compression algorithms. In a spring quarter class on multimedia compression, he will use the score in a similar way, but in this case, because the Weissman Score doesn't consider distortion introduced in lossy compression, he will expect the students to weight that factor as well.

    The scoring method as stated is only useful for evaluating lossless compression. One could also take into account the resemblance of the output to the input to allow a modified version of the score to evaluate lossy compression.
