## 1st International Longest Tweet Results

Dr_Evil6_6_6 writes

*"Slashdot had a story about the 1st International Longest Tweet Contest last month, and the winners have just been announced."* The winner is impressive.


## Re: (Score:2)

"It's kind of in bad web etiquette to ninja that entire post from Ksplice."Actually AC it's very common on

## Re: (Score:2)

The maths at the top refer to the number of unique tweet-able messages. If there are 2^4339 total unique tweet-able messages then that means there are 4339 bits available.

## Re: (Score:3, Informative)

For those wondering about a better way:

// Output the lowest 31 bits of our 4339-bit block of data, then shift down

for( i = 4339; i > 0; i-=31) {

output((wxchar)(bigInt & 0x7fffffff));

bigInt = bigInt >> 31;

}

Reverse

// Add the 31 bits to the current bigInt, then shift up

while( curInput = input() ) {

bigInt += curInput;

bigInt = bigInt << 31;

}

## Re: (Score:2)

Your encoder is encoding the original 31-bit "words" from right to left, but your decoder is decoding the original 31-bit "words" from left to right.
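The ordering problem is easier to see in a runnable sketch. The following is only an illustration in Python (chosen for its built-in big integers), not the contest entry: the names `encode`, `decode` and `NUM_CHUNKS` are made up here, and the sketch ignores the fact that not every 31-bit value maps to a usable character.

```python
CHUNK_BITS = 31
TOTAL_BITS = 4339
MASK = (1 << CHUNK_BITS) - 1                 # 0x7fffffff, the low 31 bits
NUM_CHUNKS = -(-TOTAL_BITS // CHUNK_BITS)    # ceiling division: 140 chunks


def encode(big_int):
    """Emit 31-bit chunks, least significant chunk first."""
    chunks = []
    for _ in range(NUM_CHUNKS):
        chunks.append(big_int & MASK)
        big_int >>= CHUNK_BITS
    return chunks


def decode(chunks):
    """Rebuild the integer. The last chunk emitted is the most
    significant one, so the chunks are consumed in reverse order."""
    big_int = 0
    for chunk in reversed(chunks):
        big_int = (big_int << CHUNK_BITS) | chunk
    return big_int


value = 3 ** 2000          # an arbitrary test value that fits in 4339 bits
round_trip = decode(encode(value))
```

Shifting before merging each chunk, and walking the chunks in reverse, is what keeps the decoder consistent with an encoder that emits the low bits first.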

## Re: (Score:1)

## Re: (Score:2)

1) Every "character" in Twitter, in this algorithm, can store one of 2146369536 values, since the other 1114112 run into problems.

2) 2^4339 2146369536^140

3) Thus, any 4339-bit integer in base 2 can be converted into a 140-digit integer in base 2146369536. To encode, do this conversion. To decode, do it backwards.
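The conversion in step 3 can be sketched as plain repeated division. This is a minimal illustration under stated assumptions: the base is taken as 2^31 − 1114112 = 2146369536, the function names are hypothetical, and the mapping from each digit to an actual character (skipping the problematic values) is left out.

```python
BASE = 2**31 - 1114112     # 2146369536 usable values per character
DIGITS = 140               # one base-BASE digit per tweet character


def to_digits(value):
    """Convert an integer into exactly 140 base-BASE digits,
    least significant digit first."""
    if not 0 <= value < BASE ** DIGITS:
        raise ValueError("value does not fit in 140 digits")
    digits = []
    for _ in range(DIGITS):
        value, digit = divmod(value, BASE)
        digits.append(digit)
    return digits


def from_digits(digits):
    """Inverse of to_digits: fold the digits back into one integer."""
    value = 0
    for digit in reversed(digits):
        value = value * BASE + digit
    return value


# Any 4339-bit integer fits, because 2**4339 is below BASE**140.
fits = 2**4339 < BASE ** DIGITS
largest = 2**4339 - 1
restored = from_digits(to_digits(largest))
```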

## Re: (Score:2)

Sorry, Slashdot removed my mathematical symbols. Number 2 should read "2^4339 is less than 2146369536^140"

## Re: (Score:2)

Sorry, Slashdot removed my mathematical symbols.

We should have a 1st International Longest Slashdot Post competition. Same rules as the Twitter competition, except you have to deal with Slashdot's draconian input stripping

## Re: (Score:2)

It would make far more sense to first compress the data (LZW [wikipedia.org] for example) and then encode with CRT. That would give you roughly 4600 bytes, about a 4.5K ZIP file you could send. With typical 85% compression on English language files, that means the resulting output could be about 30K in length.

## Re: (Score:2)

The contest required a scheme that would work for arbitrary data. Compressing random data with LZW can result in a file that's larger than the input.
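A quick way to see this effect is to run a general-purpose compressor over random bytes. The sketch below uses Python's `zlib` module, which implements DEFLATE rather than LZW, so it is only a stand-in for the scheme named above; the 542-byte size and the fixed seed are arbitrary choices for the demonstration.

```python
import random
import zlib

rng = random.Random(0)     # fixed seed so the run is repeatable

# 542 pseudo-random bytes, roughly one tweet's worth of arbitrary data.
raw = bytes(rng.getrandbits(8) for _ in range(542))
packed = zlib.compress(raw)

# Random data has no redundancy to exploit, so the container overhead
# makes the "compressed" form longer than the input.
growth = len(packed) - len(raw)
```

On data like this, `growth` comes out positive: the compressor can only add framing bytes, which is exactly what would eat into the tweet's capacity.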

## Looking at that entry (Score:3, Insightful)

If they ask what can be arbitrarily stored in the 4339 bits available, then you can store 4339 arbitrary bits. It's a rule of compression. If they are asking for an English language compression program, there are plenty of better ones out there. Also, if the goal is compression of English text and they aren't including the program size in the tweet, then the competition can easily be cheated by putting a dictionary in the program that can be looked up.

Looking at the winner, it's not a particularly good compression algorithm. It doesn't even seem to take Bayesian probability of characters into account. I can't see any arithmetic coding (mathematically the perfect entropy encoder) either.

## Re: (Score:3, Interesting)

that entered the competition. I find it interesting that TFA doesn't mention how many submissions were made.

## Re:Looking at that entry (Score:4, Insightful)

You don't get it?

A weak, inexplicable imitation of earlier, better tech?

That's Twitter in a nutshell.

## Re:Looking at that entry (Score:5, Informative)

4339 bits is 542 bytes plus three spare bits, so if you wanted to actually use this for something you could use those three bits to define your data format from one of eight types, then "attach" your data payload to the header to generate the sequence of 4339 bits. Some ideas for the payload would be:

## Re: (Score:2)

There are 2^4339 available bits in a valid tweet so the first algorithm takes any 2^4339 bit sequence and converts it into a valid tweet, the second converts it back again.

That's one heck of a compression algorithm! You could fit the entire internet in a single tweet! I think you're on to something, where can I invest my money?

## Re: (Score:3, Informative)

universes in 2^4339 bits, and probably do so several times over as well, let alone the entire Internet.

## Re: (Score:2)

Is 2^4339 bits actually 1337 code granules? c00lz!

## Re: (Score:2)

Or, 512 bytes plus pointers leading to next/previous "sectors" of data as metadata.

Now you're able to store an arbitrary file, and all you really need to know is the ID of the beginning. Or one of the pieces and you can then recover the file.
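As a rough sketch of that layout: the 542 bytes noted earlier in the thread leave room for a 512-byte payload plus a small header. Everything specific here is an assumption for illustration: the 64-bit IDs for the previous and next tweet, the 2-byte length field, and the names `SECTOR`, `pack_sector` and `unpack_sector`.

```python
import struct

TWEET_BYTES = 542          # 4339 bits is 542 whole bytes plus 3 spare bits

# Hypothetical sector layout, big-endian with no padding:
# previous ID (8 bytes), next ID (8 bytes), payload length (2), payload (512).
SECTOR = struct.Struct(">QQH512s")


def pack_sector(prev_id, next_id, payload):
    """Build one sector; short payloads are zero-padded to 512 bytes."""
    if len(payload) > 512:
        raise ValueError("payload larger than one sector")
    return SECTOR.pack(prev_id, next_id, len(payload), payload)


def unpack_sector(blob):
    """Return (prev_id, next_id, payload) with the padding stripped."""
    prev_id, next_id, length, padded = SECTOR.unpack(blob)
    return prev_id, next_id, padded[:length]


blob = pack_sector(41, 43, b"hello")
```

The packed sector is 530 bytes, so it fits within one tweet with 12 bytes to spare, and either pointer is enough to walk the chain from any piece.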

Sounds like a great way to store and spread files - TwitterShare! Like Rapidshare, but without the suck. And let the MPAA/RIAA battle it out with all the users.

## Re: (Score:2)

Or, you could display your 4339 bit number in base 36 to encode non-beautiful alphanumeric messages. (about 839 characters, but no capitalization, punctuation, nor whitespace)

ITA2 [wikipedia.org] (5 bit) does even better for some messages. 867 characters, but you lose some when you shift between modes (letters vs numbers/punctuation).
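These capacity estimates can be checked with exact integer arithmetic. The helper below is a small sketch; `max_symbols` is a name made up for this example. It returns the largest message length k for which every k-symbol message still fits in 4339 bits.

```python
TOTAL_BITS = 4339


def max_symbols(alphabet_size, total_bits=TOTAL_BITS):
    """Largest k such that alphabet_size**k <= 2**total_bits."""
    limit = 1 << total_bits
    k = 0
    power = alphabet_size              # alphabet_size ** (k + 1)
    while power <= limit:
        k += 1
        power *= alphabet_size
    return k


base36_chars = max_symbols(36)   # digits 0-9 plus letters a-z
ita2_chars = max_symbols(32)     # 5-bit codes such as ITA2
```

By this count base 36 holds 839 characters and a 5-bit code holds 867, before any loss from ITA2 shift characters.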

## Re: (Score:2)

I should note that yes, I know that doesn't leverage compression, and compression algorithms will do better.

## Re: (Score:2)

The contest is to figure out a way to make more bits available.

It is not obvious that Twitter messages are always guaranteed to carry 4339 bits of information (which is why the original post announcing the contest offers only 4200 bits).

Any attempt to use "compression" as we usually understand it would be pointless because you can't always fit x bits of arbitrary data in an x-1 bit channel.
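The counting argument can be made concrete with a toy brute-force search. This is a sketch only: `find_collision` and `drop_last_bit` are hypothetical names, and the "compressor" is deliberately trivial. Any function that maps x-bit messages onto x-1 bits must send two different messages to the same output.

```python
from itertools import product


def find_collision(encoder, x):
    """Try every x-bit message; return the first pair of distinct
    messages that the encoder maps to the same output."""
    seen = {}
    for bits in product("01", repeat=x):
        message = "".join(bits)
        code = encoder(message)
        if code in seen:
            return seen[code], message
        seen[code] = message
    return None


def drop_last_bit(message):
    """A stand-in 'compressor' that squeezes x bits into x-1 bits."""
    return message[:-1]


collision = find_collision(drop_last_bit, 3)
```

There are 2^x inputs but only 2^(x-1) outputs, so a collision is guaranteed whatever encoder is plugged in, and the colliding messages cannot both be decoded.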

If it makes you feel any better, a lot of commenters didn't get it, either.

## Re: (Score:2)

No, they're asking how many bits you can get out of a 140 character tweet regardless of what you can encode into those bits. If they just type ASCII, that's 140 * 7 bits = 980 bits of info (I'm ignoring nonprintables for the sake of argument). Yeah, you could build a compression scheme on top of those 980 bits, but that's not what the competition is about. Through the use of Unicode characters and other tricks, someone managed to get over 4339 bits, averaging 31 bits per "character".

## Erm ... (Score:5, Insightful)

Except for the fact the algorithms he has submitted have NOTHING to do with compression, and are just a method of mapping the 4339 bits into the allowable Unicode character set over 140 x 32 bit character "slots", i.e. encoding / decoding only.

With 4339 bits, hell in theory the longest actual tweet you could make is 2^4339 of any single character you choose, using the 4339 bits just to represent a (very large) counter of how many times to repeat the character.

Considering that 2^4339 is approximately 10^1306, and there are probably only 10^82 atoms in the whole universe, that's one bloody long tweet.
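The size of that counter is easy to check with Python's arbitrary-precision integers. This is just an arithmetic sketch; the variable names are invented here and the 10^82 atom count is the figure quoted above, not a measured value.

```python
import math

tweet_count = 2 ** 4339                 # largest repeat count, roughly
exponent = math.log10(tweet_count)      # about 1306.17
digit_count = len(str(tweet_count))     # number of decimal digits

atoms_in_universe = 10 ** 82            # figure quoted in the comment above
ratio_digits = len(str(tweet_count // atoms_in_universe))
```

The number has 1307 decimal digits, so it is a little over 10^1306, and it still has well over a thousand digits left after dividing by the atom count.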

## Re: (Score:2)

Nah, you can do far better than that. "The character 'a' repeated Graham's number of times" is just a start...

## Re: (Score:2)

Nah, you can do far better than that. "The character 'a' repeated Graham's number of times" is just a start...

But the Kolmogorov Complexity [wikipedia.org] of that is rather smaller. It's *that* which is limited by Twitter, not the eventual expanded size of the message.

## Re: (Score:3, Insightful)

Yes, but that's not what GPP was talking about. Why on Earth would you assume that comments on /. would be on-topic, when that would require reading TFS? ;)

## Re: (Score:2, Insightful)

## Re: (Score:2)

Yes, but what if you put down *two* coats of paint? Hmm? What do you think of *that*, Mr. Smartypants?

## Re: (Score:2)

Depends on your definition of "arbitrary".

I was merely describing a very simple compression system, which does in fact define up to 2^4339 distinct messages. The fact that all the messages are composed of a single character does not detract from the theoretical maximum number of possibilities.

Of course, as your dictionary (character set) increases, then the number of your "arbitrary" messages becomes a lot less than this ...

If your character set is limited to the uppercase A-Z and a space (enough to transmi

## I guess I could get Zip Quine in there... (Score:2)

http://www.steike.com/code/useless/zip-file-quine/ [steike.com] ...infinite compression.

## Limits (Score:3, Funny)

## 16000bits (Score:4, Funny)

Solution - Just tweet the following picture of a swimming fish:

".`.`..`.>"

Given that 1 word is 16 bits, and a picture is equal to 1,000 words, :-)

that makes my above tweet 16,000 bits of information (fitting

several pictures in a tweet may extend this further)

(.)(.)

(.Y.)

d^_^b

48000bits!

## A report from 4chan (Score:1)

## The FIRST International Longest Tweet Contest? (Score:2)

valid Unicode characters as well. They just avoided them (like the contest bloggers) because they weren't sure that there wasn't some arbitrary string of characters that would mess up the message.

I suppose that the contest could continue on that basis alone: how many more bits can you encode by using t