Google To Promote Web Speed On New Dev Site 106
CWmike writes "Google has created a Web site for developers that is focused exclusively on making Web applications, sites and browsers faster. The site will allow developers to submit ideas, suggestions and questions via a discussion forum and by using Google's Moderator tool. Google hopes developers will join it in improving core online technologies such as HTML and TCP/IP. A prime example of how Web performance can be enhanced, Google believes, is the development of HTML 5, which provides a major improvement in how Web applications process JavaScript. 'We're hoping the community will spend some time on the basic protocols of the Internet,' Google product manager Richard Rabbat said. 'There's quite a bit of optimization that can be done [in that area].'"
Why Do They Ignore Their Own Advice? (Score:5, Interesting)
HTML - as opposed to XHTML, even when delivered with the MIME type text/html - allows authors to omit certain tags. According to the HTML 4 DTD [w3.org], a surprising number of start and end tags are optional, and so-called "void" (empty) elements such as <br> and <img> never take an end tag at all.
For example, if you have a list of items marked up as <li>List item</li>, you could instead just write <li>List item. Or instead of a paragraph that you'd usually close with </p>, you could just use <p>My paragraph. This even works with html, head, and body, whose tags are not required in HTML. (Make sure you feel comfortable with this before making it your standard coding practice.)
Omitting optional tags keeps your HTML formally valid while decreasing your file size. In a typical document, this can mean 5-20% savings.
Now, my first reaction was simply "that cannot be valid!" But, of course, it is. What I found interesting is that when I looked at the source for that tutorial they themselves are using </li> and </p>. Interesting, huh? You would hope that Google would follow the very advice they are trying to give you.
Some of these suggestions may come at the cost of readability and maintainability. There's something about web pages being nice tidy properly formatted XML documents with proper closing tags that I like.
Re:Why Do They Ignore Their Own Advice? (Score:5, Interesting)
The trouble with web pages is that they are source and 'released binary' all in one file, so if you put comments in (as you always should) and use meaningful tag and variable names, then your download gets quite a bit bigger.
What you really need is a system to 'compile' the source pages to something less readable, but significantly smaller - removing comments, replacing the unneeded end tags, shortening the variable names. If that was automated - so your source files were deployed to the server via this translator, then you'd never even know the difference, except your users on low-bandwidth (ie mobile) devices would love you more.
We used a primitive one many years ago, but I don't know if there's any improvements to the state of web-page optimisers today.
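Something along those lines can still be thrown together as a PHP output filter. This is only a rough sketch - the callback name is made up, and it ignores awkward cases like <pre>, <textarea> and inline scripts that real minifiers (HTML Tidy, mod_pagespeed and friends) handle properly:
<?php
// Sketch of the 'compile on the way out' idea: an output buffer callback
// that strips HTML comments and collapses runs of whitespace before the
// page is sent to the client.
function shrink_html($html)
{
    // Drop HTML comments, but keep conditional comments for old IE.
    $html = preg_replace('/<!--(?!\[if).*?-->/s', '', $html);
    // Collapse whitespace runs between tags to a single space.
    $html = preg_replace('/>\s+</', '> <', $html);
    return $html;
}
ob_start('shrink_html');   // everything echoed below gets filtered
?>
<html>
  <head><title>Demo</title></head>
  <body>
    <!-- this comment never reaches the client -->
    <p>Hello, world.</p>
  </body>
</html>
Deploy-time minification (run the same filter when copying files to the server) gets you the same bytes saved without per-request CPU cost; which is better depends on how dynamic the pages are.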
Re: (Score:2, Insightful)
Come on now. The price of downloading html and javascript source is peanuts compared to images and flash animations. The solution is better web design, not another layer of complexity in the process. There is no shortage of low-hanging fruit to be picked here. Metric tons, you could say.
Re:Why Do They Ignore Their Own Advice? (Score:5, Informative)
> The price of downloading html and javascript source is peanuts compared to images and
> flash animations
That may or may not be true... Last I checked, a number of popular portal sites (things like cnn.com) included scripts totaling several hundred kilobytes as part of the page. The problem is that unlike images, those scripts prevent the browser from doing certain things while the script is downloading (because you never know when that 200kb script you're waiting on will decide to do a document.write and completely change what you're supposed to do with all the HTML that follows it). So the cost of downloading scripts is _very_ palpable...
Re: (Score:2)
Sure, but a lot of sites set the cache expiration time pretty low so they can all roll out updates often. In the best case, that just means a single conditional GET and 304 response, which isn't too bad.
But I think the one of the main ideas being discussed here is optimizing initial pageload of the site, and the cache doesn't help with that. If it's an SSL site and doesn't explicitly allow persistent caching of its data, it doesn't even help across browser restarts.
Re: (Score:2)
The problem is that unlike images, those scripts prevent the browser from doing certain things while the script is downloading (because you never know when that 200kb script you're waiting on will decide to do a document.write and completely change what you're supposed to do with all the HTML that follows it). So the cost of downloading scripts is _very_ palpable...
All the more reason to avoid document.write and use JavaScript with the DOM to update the content of your pages instead.
Re: (Score:2)
Sure but remotely included scripts are cached, so the page loads slower only on the first load.
Is that really such a big deal?
Re:Why Do They Ignore Their Own Advice? (Score:5, Informative)
What you really need is a system to 'compile' the source pages to something less readable, but significantly smaller - removing comments, replacing the unneeded end tags, shortening the variable names. If that was automated...
Something like gzip compression [apache.org] perhaps?
Re: (Score:3, Insightful)
If you save 320 bytes per file, serving 200 different files 750,000 times per day each (imagine some HTML docs that load a bunch of images, JavaScript, and CSS), that's 1.3TB over the course of 30 days. It adds up fast.
320 was chosen out of the air, as the total length of removed JavaScript comments (320 bytes is the content of 2 full SMS messages), trimmed image pixels, or extraneous tabs in an HTML document. Of course some files will see more page hits than others, some days will see less traffic on the s
Re:Why Do They Ignore Their Own Advice? (Score:4, Insightful)
The problem with gzip compression (in this case) is that it's not lossy. All of the "unnecessary" things that you have (e.g. the unneeded closing tags on some elements) will still be there when you decompress the transmitted data. I think the grandparent wants a compression algorithm that's "intelligently lossy"; in other words, smart enough to strip off all the unneeded data (comments, extra tags, etc.) and then gzip the result for additional savings.
Re: (Score:2)
Sounds like you're talking about server-side HTMLTidy. Jesus, how are you supposed to troubleshoot if your page doesn't publish/render the same way as you develop it? I guess "LAZY" is the answer. If that was a good idea I think the W3C would've mandated it with HTML 3.0.
Turn it off in your dev environment until you're ready to debug issues that come up with it (i.e. after you feel everything is ready otherwise). Sure it's an extra cycle of development, but if HTMLTidy (or whatever you use) isn't doing something really weird, everything should work exactly the same as it does without it being turned on.
A 15KB savings per page load on a site that gets 15 million hits per day works out to roughly 225GB a day, or over 6TB less traffic per month. How much do you pay per GB of traffic? Would this be worth it? What
Re: (Score:2)
Don't forget, those "unneeded" closing tags are needed in HTML 5. The days of newline tag closings are numbered.
Re: (Score:2)
New lines have never been a substitute for a closing tag in HTML. Context, such as starting another <p> before formally closing the previous one, has. (Paragraphs, by the specification, cannot contain other block-level elements, including other paragraphs. The specifications allow for the omission of certain elements when other parts of the specification preclude ambiguity.)
Of course, most authors would put a new line in their code at this point for readability, but that's another story.
Re: (Score:2)
Closing tags like li are going to compress down nicely with gzip if there are enough to take up lots of space.
I suspect that any kind of HTMLTidy approach on web pages is not going to be very successful at saving space, compared to something like gzip, or even in addition to it. For example leaving out end tags on lists won't save much space at all if those are already stored with a small token after compression, being so common. It's kind of like compressing a file twice - you're not going to see massive
Re: (Score:2)
While you're absolutely right (there is no excuse not to support gzip compression on your web server these days), a file loaded with comments and unnecessary whitespace is still going to compress down to a larger size than one with the comments and out-of-tag whitespace removed. There is simply less data to compress in the first place. (Note: things such as long CSS ids don't matter much, because after their first occurrence they'll be pattern-matched and take up about the same space as a shorter name anyway.)
Re: (Score:3, Informative)
This is what's called mod_deflate on Apache 2.
I'm using it on a couple of small sites I maintain. The text portions of those sites get compressed to less than 50% of their original size. Obviously it does not compress images, PDFs,...
There's no need to compress those anyway, since those formats are already compressed before they go online.
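For what it's worth, if you can't touch the Apache config (shared hosting, say), PHP can gzip its own output too. A minimal sketch, assuming the zlib extension is available:
<?php
// ob_gzhandler inspects the client's Accept-Encoding header and only
// compresses when the browser advertises gzip/deflate support.
// mod_deflate is still preferable, since it also covers static files.
if (extension_loaded('zlib')) {
    ob_start('ob_gzhandler');
} else {
    ob_start();   // fall back to plain buffering
}
echo '<html><body><p>This goes out compressed when the client supports it.</p></body></html>';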
Re: (Score:2)
A lot of the elision allowed was for human beings writing html code by hand. Interpreting it in a DFA without the extra closing tags takes no more cycles at run-time, but more work writing the interpreter.
Logically, the only thing you're gaining by leaving out tags is the time to read those extra bytes, so this part of the optimization isn't going to give you a lot of performance.
--dave
Re: (Score:2)
Well, all that would be unnecessary if server-side gzip were turned on. I consider that a type of web page optimization, and you don't really have to do anything special with the HTML.
I believe there is a case to be made for compression even for very dynamic websites. It works very well for mobile devices like Blackberries.
Re: (Score:2)
I have a script that pulls the comments out of my html, css, and js files before uploading them to the server for this reason entirely.
For simple (read: small) pages it's not a huge problem (adds 1k or so), but it can become a problem for larger pages. The repositories for the files contain all the comments needed to develop and maintain the code, but the pages that are actually viewed by the end-user don't. As much as the inquisitive end-user may like to have commented html/css/js to look at, it's much more pract
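The script doesn't have to be anything fancy, either. A hypothetical PHP sketch of the same idea (made-up file names; it only handles CSS block comments and HTML comments, since reliably stripping JS comments means parsing string and regex literals, which is a job for a real minifier):
<?php
// Deploy-time comment stripper: write cleaned copies into deploy/ (assumed
// to exist) while the commented originals stay in the repository.
$files = array('style.css', 'index.html');   // hypothetical file list
foreach ($files as $file) {
    $src = file_get_contents($file);
    if (substr($file, -4) === '.css') {
        // CSS block comments; naive, but fine for typical stylesheets
        $src = preg_replace('!/\*.*?\*/!s', '', $src);
    } else {
        // HTML comments, keeping IE conditional comments
        $src = preg_replace('/<!--(?!\[if).*?-->/s', '', $src);
    }
    file_put_contents('deploy/' . $file, $src);
}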
Re: (Score:2)
You hardly need comments if you write clean HTML. Most of the complicated stuff that makes the web slow is the super convoluted javascript and flash garbage that is mostly intended to hamper users from accessing content. The sort of people/companies that produce these sites aren't really concerned about their visitors' convenience. They're interested in controlling and monitoring their visitors. I'm having trouble believing these people care much about how slow and miserable their sites are.
If you're one of
Re: (Score:1)
What you really need is a system to 'compile' the source pages to something less readable, but significantly smaller - removing comments, replacing the unneeded end tags, shortening the variable names. If that was automated - so your source files were deployed to the server via this translator, then you'd never even know the difference, except your users on low-bandwidth (ie mobile) devices would love you more.
We used a primitive one many years ago, but I don't know if there's any improvements to the state of web-page optimisers today.
The Aptimize website accelerator (www.aptimize.com [aptimize.com]) does exactly this, except it's implemented as an output filter on your web server, so content is dynamically optimized as it is sent to the user, obviating the need for a separate optimizing deployment step. It does things like combining CSS and JavaScript references on a page, inlining CSS background images, combining images into CSS mosaics to reduce request counts, minifying CSS and JavaScript files, and adding proper cache headers to all page reso
Re: (Score:2)
Well, it still is a verbosity joke.
I internally use a format derived from EBML, Matroska's generic binary markup format.
I simply added a mapping header, that maps tag names and parameter names to the tag ids.
That way I can easily convert, and edit the files, with any text editor, and transform from XML and back without any hassle at all.
It's just like ASCII is a mapping of numbers to characters. Just on one level higher.
It's nearly too simple and obvious. So I think it should become a new s
Re: (Score:1)
You can do Interesting Things with HTML and tag and attribute minimization. This is a valid web page:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><title//<p//
I wrote about this [aamcf.co.uk] a few years ago.
Re: (Score:2)
Yes, but not closing your tags is not being XHTML compliant, and Google has an image to keep up!
They show off what they know, but they want to remain politically correct.
Comment removed (Score:3, Interesting)
Re: (Score:2)
Gah!
Say it with me: "Use prepared SQL queries not concatenation!"
Their video is dynamically building the SQL statement, which is full of injection possibilities.
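For anyone following along, here's roughly what that looks like with PDO; the DSN, credentials, table, and parameter are placeholders:
<?php
// Parameterized query instead of string concatenation: the user-supplied
// value is sent separately from the SQL text, so it can't change the
// statement's structure.
$db = new PDO('mysql:host=localhost;dbname=example', 'user', 'secret');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$author = isset($_GET['author']) ? $_GET['author'] : 'anonymous';
$stmt = $db->prepare('SELECT id, body FROM comments WHERE author = ?');
$stmt->execute(array($author));   // value bound here, never concatenated

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    echo htmlspecialchars($row['body']), "\n";
}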
Re: (Score:2)
An ideal solution would be some way to store the prepared version of a query on the server.
Re: (Score:2)
I don't know about MySQL, but prepared statements in most major RDBMSs allow the database to cache the query plan as well as being more easily optimized. So they actually are much -faster- if you need to execute the query over and over (especially if you can reuse the same statement object). Many database APIs will also let you use statement objects that have the same capabilities as prepared statements in terms of query plan caching and safety, but do not do the first roundtrip to optimize the query on the
Re: (Score:3, Interesting)
As for using commas with echo, why aren't you using a template engine?
Re: (Score:3, Insightful)
From TFA:
Sometimes PHP novices attempt to make their code "cleaner" by copying predefined variables to variables with shorter names. What this actually results in is doubled memory consumption, and therefore, slow scripts.
It seems to me that this is a flaw in the PHP interpreter, not the PHP programmer. The way I see it, the interpreter should be lazily copying data in this case. In other words, the "copy" should be a pointer to the original variable until the script calls for the copy to be changed. At that point the variable should be copied and changed. I believe this is how Python handles assignments, and I'm surprised that PHP does not do it the same way.
Re: (Score:1, Informative)
It does.
From the Garbage Collection section (2.3.4) of O'Reilly's Programming PHP:
PHP uses reference counting and copy-on-write to manage memory. Copy-on-write ensures that memory isn't wasted when you copy values between variables, and reference counting ensures that memory is returned to the operating system when it is no longer needed.
To understand memory management in PHP, you must first understand the idea of a symbol table . There are two parts to a variable--its name (e.g., $name), and its value (e.g., "Fre
Re: (Score:1)
For the record, Google's very first claim (about copying variables doubling memory consumption) is dubious. Although the language itself passes most variable types by value, the PHP interpreter copies variables lazily using copy-on-write [wikipedia.org]. So in general, "aliasing" variables for readability should be OK. However, I'm not sure whether the interpreter copies individual array elements lazily...
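If you want to see the copy-on-write behaviour for yourself rather than take anyone's word for it, a quick sketch with memory_get_usage() shows the pattern (exact numbers will vary by PHP version):
<?php
// Build a large-ish array, then "copy" it and watch memory usage.
$a = array_fill(0, 100000, str_repeat('x', 100));
echo memory_get_usage(), "\n";   // baseline with the original array

$b = $a;                          // copy-on-write: nearly free
echo memory_get_usage(), "\n";

$b[0] = 'changed';                // first write forces the real separation
echo memory_get_usage(), "\n";    // memory jumps noticeably here, not above
As far as I understand it, even that forced copy is shallow - the array's buckets get duplicated but the string values stay reference-counted - so element values are not duplicated until they themselves are written to.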
C'mon slashdot, get working (Score:2, Funny)
We've got to slashdot their site for ultimate irony! :)
Re: (Score:2)
You're about half a decade too late, sorry. Slashdot hasn't been able to slashdot anything beyond the puniest tin-can-and-string servers for a long time.
Re: (Score:2)
You want to slashdot a google.com site?
I think your sense of scale might be a bit off here, but good luck with that anyhow :)
Start by eliminating the zero bits (Score:2, Funny)
The skinnier ones compress much easier.
Re:Start by eliminating the zero bits (Score:4, Funny)
Yes, the 1s are skinny and the 0s are fat. You see, there is more space to compress between a line of evenly spaced 1s than between a line of evenly spaced 0s. If you compress with too much force, the 0s get "squished", they'll turn into 1s, and this can screw up the formatting and cause segfaults and kernel panics, even in the newer Linux builds. There isn't much that can be done about this, even with today's protected memory designs, so we're limited to removing the space in between. It might help you to think of this technique as the giant laser thingie in "Honey, I Shrunk The Kids!", which reduced the space between atoms of objects in order to shrink them.
ROR compression (a variation of the .rar format) uses this particular method, replacing the 0s with a counter of how many 0s in a row were replaced, and then compressing the 1s together. This is called "Packed Binary Coding".
Similar methods were developed by American researchers (Dynamic Unicode Hypertensioning), but instead of simply compressing the 1s, they are converted into a pipe character ("|") so that the tick mark adorning the shaft of the 1 doesn't prevent further compression (or cause errors resulting from "tilt" when the ones are pressed together too forcefully).
These are second-year Comp Sci concepts. What is /. coming to when we're not even keeping up with the basics? It's a sad day for geeks everywhere.
Re: (Score:2)
All this is pretty obvious though... but experts in the field such as myself know that for the best compression, you need to use a sans-serif font. All the serifs on the ends can otherwise take up extra space, so you can fit less in a packet.
The other curious thing about this is that by using 1's instead of 0's, you get better compression by using more bits. But if you find you actually need to us
Revolutionary idea (Score:5, Funny)
Re: (Score:2)
You mean what Microsoft has been doing since 1990-something?
Re: (Score:2, Insightful)
WHOOSH
Re: (Score:1)
No one said anything about it not being connected to the internet.
It's a plague. (Score:1)
Re: (Score:3, Insightful)
Technology is advancing (I think I read somewhere that JS processing is 100x faster in modern browsers) and there are a lot of develo
Those Google engineers sure are a sexy bunch! (Score:1, Funny)
Why oh why was this in video format?
good idea (Score:3, Funny)
As any open source developer knows, what's needed is more ideas, suggestions, and questions. Later, once the discussion group has come to consensus, we'll write some code.
WebSpeed? (Score:2)
Re: (Score:2)
Considering we use Progress, I was thinking the same thing...
Just write a native client-side app (Score:2, Interesting)
Yes, I realise that for n00bs it's all about the convenience of web apps, but client-side apps need not be inconvenient. Look at the iPhone app store: n00bs love it and it's full of client-server applications. If there was something like it for Windows and OS X we'd never need to work
Re: (Score:2)
HTML/HTTP were never designed as a method for running remote applications and shouldn't be used as such.
Developers use the best tool for the job and (sadly) Web apps are more functional and useful to people than native clients in many instances.
Yes, I realise that for n00bs it's all about the convenience of web apps, but client-side apps need not be inconvenient. Look at the iPhone app store: n00bs love it and it's full of client-server applications.
This is part of an interesting shift in computer technology. Mainly it is the shift to portable computing owned by the user. This contrasts with work provided computers controlled by them and public terminals. When you want to check your personal e-mail at work, using your work provided desktop, well a Web application is really convenient. When you want to check your p
Re: (Score:2)
HTML/HTTP were never designed as a method for running remote applications and shouldn't be used as such. We spent all these years upgrading to the latest Core 2 Trio so we could make the internet connection the new bottleneck.
Well, yeah. It was designed to serve content, but to downplay server-side content is to discount the whole reason PHP, CGI, and ASP were made.
There is a dramatic need for web hosts and web developers to control the platform in which your application will run. Your only alternative is to c
Stop using off-site crap (Score:2)
Like Google Analytics, or Google Ads. When Google went pear-shaped some time back it made a significant portion of the Web unusable. If your own server is down, no big deal. If other sites depend on your server, then it's a problem.
While I'm slagging off Google, why don't they stop Doing Cool New Stuff and improve their fucking search engine instead?
Yahoo has a good page, too (Score:2, Informative)
Yahoo! has a handy page (http://developer.yahoo.com/performance/ [yahoo.com]) with lots of good info. It includes YSlow (a Firefox add-on), a set of "Best Practices," and some good research. Also references a couple of O'Reilly books (which, to be fair, I haven't read).
More specifically, it covers CSS sprites (see http://www.alistapart.com/articles/sprites/ [alistapart.com]), consolidating JavaScript to reduce HTTP requests, and a few other things that may surprise or inform.
Re: (Score:2, Interesting)
I am honestly torn on the idea of CSS sprites. While yes, they do decrease the number of HTTP requests, they increase the complexity of maintaining the site. Recently, Vladimir Vukićević pointed out how a CSS sprite could use up to 75MB of RAM to display [vlad1.com]. One could argue that a 1299x15,000 PNG is quite a pain, but in my experience sprites end up being pretty damned wide (or long) if you have images that will need to be repeated or are using a faux columns technique.
Some times it gets to be a bette
external resources in HTML pages (Score:4, Insightful)
The number one slowdown I see on pages is linking to all kinds of external resources: images, flash movies, iframes, CSS, bits of javascript. Each of these requires at least another DNS lookup and a new HTTP connection, and often those external servers take a really long time to respond (because they're busy doing the same for all those other websites using them). Why is this going on in each user's browser? It should all be done behind the scenes on the web server. Why would you put the basic user experience of your users or customers in the hands of random partners who are also doing the same for competing sites? It takes some load off your server, but I think the real reason people just link in external resources as images, objects, etc. is that it's easier than implementing it in the back end. If you really want to offload work, then design a mechanism that addresses that need specifically.
We've ended up with a broken idea of what a web server is. Because it was the easiest way to get started, we now seem to be stuck with the basic idea that a web server is something that maps request URLs directly to files on the server's hard disk that are either returned as is or executed as scripts. This needs to change (and it is a little bit, as those "CGI scripts" have now evolved into scripts which are using real web app frameworks.)
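As a sketch of what "behind the scenes on the web server" could look like in PHP (the URL, cache path and TTL are made-up placeholders, and a real version would want proper timeouts so a slow partner can't stall your own page; assumes allow_url_fopen is enabled):
<?php
// Fetch an external resource server-side and cache it locally, so the
// visitor's browser only ever talks to your own server.
function cached_fetch($url, $cache_file, $ttl = 300)
{
    if (file_exists($cache_file) && time() - filemtime($cache_file) < $ttl) {
        return file_get_contents($cache_file);   // serve the fresh local copy
    }
    $data = @file_get_contents($url);            // refresh from the partner
    if ($data !== false) {
        file_put_contents($cache_file, $data);
        return $data;
    }
    // Partner is down: fall back to a stale copy rather than breaking the page.
    return file_exists($cache_file) ? file_get_contents($cache_file) : '';
}

echo cached_fetch('http://partner.example.com/widget.js',
                  '/tmp/widget.js.cache');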
Opera Unite - resourcefetcher.js (Score:1, Interesting)
I was having a look over Opera Unite services when looking to write one of my own, and I noticed this handy little function.
It fetches all the external page objects after the initial page has loaded.
Sadly, the example (homepage) failed in the sense that the basic CSS was not always the first thing to be loaded, which resulted in buckled pages on tests for slow upload speeds. (and some things weren't obviously clickable objects before images were loaded in)
So, in this way, an initial page could be loaded tha
Some very slow sites: Slashdot and Facebook (Score:3, Interesting)
More and more, sites are generating the message "A script on this page is running too slowly" from Firefox. Not because the site is hung; just because it's insanely slow. Slashdot is one of the worst offenders. The problem seems to be in ad code; Slashdot has some convoluted Javascript for loading Google text ads. Anyway, hitting "cancel" when Slashdot generates that message doesn't hurt anything that matters.
Facebook is even worse. Facebook's "send message" message composition box is so slow that CPU usage goes to 100% when typing in a message. Open a CPU monitor window and try it. I've been trying to figure out what's going on, but the Javascript loads more Javascript which loads more Javascript, and I don't want to spend the debugger time to figure it out.
Re: (Score:2)
Facebook needs to step back and optimize, optimize, optimize. They're well ahead of MySpace, and with the reputation MySpace is getting, Facebook would do well to keep things clean and fast; there isn't really a danger of competitor innovation destroying them (in the short term).
Re: (Score:3, Interesting)
Documents (3 files) 7 KB (592 KB uncompressed)
Images (111 files) 215 KB
Objects (1 file) 701 bytes
Scripts (27 files) 321 KB (1102 KB uncompressed)
Style Sheets (12 files) 69 KB (303 KB uncompressed)
Total 613 KB (2213 KB uncompressed)
So is this new Google initiative... (Score:2)
...available to Google developers? Because some of the slowest applications on the planet are Google apps: The gmail and adwords applications come immediately to mind.
I think it's somewhat disingenuous to imply that slow web interfaces are someone else's problem when in fact Google is probably one of the worst perpetrators when it comes to slow interfaces.
Yslow vs. Speed (Score:3, Informative)
For those of us who are into web site performance, the standard tool has been YSlow [yahoo.com], a Firefox extension that measures front-end (browser) page loading speed, assigns a score to your site/page, and then gives a set of recommendations for improving the user experience.
Now Google has the similar Page speed [google.com] Firefox extension.
However, when I tried it, with 5+ windows and 100+ tabs open, Firefox kept eating away memory, and then the laptop swapped and swapped and I had to kill Firefox, and go in its configuration files by hand and disable Page Speed. I have Yslow on the same configuration with no ill effects.
PHP advice legitimacy (Score:1)
Don't copy variables for no reason.
Sometimes PHP novices attempt to make their code "cleaner" by copying predefined variables to variables with shorter names. What this actually results in is doubled memory consumption, and therefore, slow scripts. In the following example, imagine if a malicious user had inserted 512KB worth of characters into a textarea field. This would result in 1MB of memory being used!
BAD:
$description = $_POST['description'];
echo $description;
GOOD:
echo $_POST['description'];
Now I would never question the almighty Google, but Rasmus Lerdorf taught me that PHP uses copy-on-write. Quoting from his O'Reilly Programming PHP book:
When you copy a value from one variable to another, PHP doesn't get more memory for a copy of the value. Instead, it updates the symbol table to say "both of these variables are names for the same chunk of memory."
So who's right? I tend to believe M. Lerdorf since he pretty much invented PHP, but like I said before, I'm not an expert and my book is pretty old (PHP 4.1.0), so maybe that has changed since (although I doubt it)...
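One way to settle it on whatever PHP version you happen to be running is to just measure it. A quick sketch (the 512KB string stands in for the textarea from Google's example):
<?php
// Watch memory usage around the "bad" copy from the Google advice.
$_POST['description'] = str_repeat('A', 512 * 1024);   // simulate the 512KB textarea

$before = memory_get_usage();
$description = $_POST['description'];                   // the allegedly wasteful copy
$after_copy = memory_get_usage();
$description .= '!';                                    // write to the copy
$after_write = memory_get_usage();

printf("copy cost: %d bytes, write cost: %d bytes\n",
       $after_copy - $before, $after_write - $after_copy);
If copy-on-write is working, the first number should be tiny and only the write should allocate a second buffer.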
Re: (Score:1)
In that particular case, it just might. If the code were to later modify $description it would require a whole new entry in the symbol table. Then it would be using up twice the memory.
I'm pretty sure that PHP also has some pretty slick automatic unloading too though, so it might look ahead, see that you don't use $_POST['description'] again and promptly drop it out of memory (FYI: I am in no way advocating the "fugheddaboudit" approach to memor
Re: (Score:1)
Unless I'm wrong -- and I could be -- compression is usually less effective on small payloads, in some cases even making the payload bigger. POSTs might be big, but GETs usually aren't. Compressing that won't help you a lot.
new (old) file formats still needed (Score:2)
Whatever happened to JPEG2000? (Patent problems?)
SVG should've been here long ago. IE will continue to slow SVG adoption in the real world.
If we could get JPEG2000 (or something like it) and SVG in 95+% of browsers, I think we'd be golden. That and getting rid of IE6 with its broken box model (among many other problems), would go a long way towards modernizing the Web. Take HTML5 and add the top 10 features or so of CSS3, and it's party-time for web devs once again. MS needs to shit or get off the pot with
Be nice to dig in. (Score:1)
I'm excited about this; I've been working on a site, Impostor Magazine, using Flex and different tools... nice to have a place to play and test.
Why do XML closing tags contain the tag name? (Score:3, Interesting)
<html>
__<head>
____<title>Example</>
__</>
__<body>
____<h1>Example</>
____<p><em>This <strong>is</> an</> example.</>
__</>
</>
I know this makes it hard for a human to see opening/closing tags, but if XML parsers (including those in browsers) were able to accept markup with short close tags or the normal named close tags, then we could: 1. benefit where the markup is machine generated and, 2. easily pre-process manually created markup.... it's easy enough to convert back and forth.
But maybe there's a good reason for not doing this that I'm missing... but it's always bothered me!
Re: (Score:2)
Seems like ambiguity is a problem. Is this:
The same as this?
Always in favor of optimization (Score:1)
<sarcasm>Why bother finding a more efficient way to do [whatever] when you're talking microseconds at the user's end?</sarcasm>
I'm actually sort of surprised a glut of bandwidth and server power hasn't led to a similar "kitchen sink" approach to web technol
Before speed we need Mail to be fixed (Score:1)
I myself waste the most time on the Internet on spam, so let's fix the most annoying things first.
Start signing your e-mail so I can reliably filter on where it comes from, and use the web of trust to indicate whether I have any (even remote) trust relationship with the sender.