Genetic Algorithm Improves Shellsort
gstover writes "As a personal programming project recently, I used a genetic algorithm to find a better sequence of increments for Shellsort. It found at least one sequence that outperforms the best previously known by about 5 percent. Here is an article about it (also here). I believe this is the first time a genetic algorithm has actually improved Shellsort."
Send it to DKnuth (Score:3, Funny)
Mark
Impressive (Score:2, Interesting)
I could imagine this shell sort being applied to database technology: a separate low-priority thread could evolve the "gene pool" in the background to improve the indexing of the tables.
On a side note, I am learning about genetic algorithms in the hope of working on them when I take my Master's in Computer Science.
Re:Impressive (Score:3, Informative)
Quoting from your own link: "Invented by Donald Shell in 1959, the shell sort is the most efficient of the O(n^2) class of sorting algorithms. Of course, the shell sort is also the most complex of the O(n^2) algorithms."
Which part of "shell sort is the most efficient of the O(n^2) class of sorting algorithms" do you not understand?
Of course, why speed up an O(n^2) sort at all - why not try to speed up an O(n log n) sort like quicksort? Try to figure out the best way to choose the pivot point!
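For illustration only (not something from the article), a common answer is median-of-three pivot selection, which avoids the quadratic blow-up on already-sorted input. A minimal Python sketch:

    def median_of_three(a, lo, hi):
        """Return the median of a[lo], a[mid] and a[hi] to use as the pivot,
        so sorted or reverse-sorted input no longer drives a naive
        first-element-pivot quicksort into its quadratic worst case."""
        mid = (lo + hi) // 2
        return sorted((a[lo], a[mid], a[hi]))[1]

    print(median_of_three([3, 1, 4, 1, 5, 9, 2, 6], 0, 7))   # median of 3, 1, 6 -> 3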
Re:Impressive (Score:4, Interesting)
An obvious extension to generic lengths is to use these precomputed networks as recursion base cases for quicksort, instead of switching to selection sort for lengths <= x (x ~ 5 typically).
IIRC shellsort works somewhat similarly: sorting interleaved subsequences at successively smaller increments before a final full pass.
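To make the idea concrete, here is a hedged Python sketch (mine, not the poster's) of a quicksort that runs a fixed 4-element sorting network on tiny ranges instead of recursing further; the particular network and the cutoff of 4 are illustrative choices:

    # Optimal 5-comparator network for 4 keys; each pair is a compare-exchange.
    # Applying only the in-range comparators also sorts 2 or 3 keys correctly.
    NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo + 1 <= 4:
            # Base case: run the precomputed network instead of recursing.
            for i, j in NETWORK_4:
                if lo + j <= hi and a[lo + i] > a[lo + j]:
                    a[lo + i], a[lo + j] = a[lo + j], a[lo + i]
            return a
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:                      # classic Hoare-style partition
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i, j = i + 1, j - 1
        quicksort(a, lo, j)
        quicksort(a, i, hi)
        return a

    print(quicksort([5, 3, 8, 1, 9, 2, 7, 4, 6, 0]))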
Re:Impressive (Score:2, Informative)
You should perhaps look not at genetic algorithms but at a newer "family" known as PSO (Particle Swarm Optimization). It's obviously very similar to GAs, but it builds cooperation into the development of solutions. That is the "weak" point of GAs, as they work only by having solutions "battle" one another.
Swarms are a concept that was "invented" in 1995, I believe. Read some articles about it.
Historic moment! (Score:2)
Re:Historic moment! (Score:1)
Re:Historic moment! (Score:2)
Nice and all (Score:1)
Re:Nice and all (Score:2, Informative)
Re:Nice and all (Score:3, Informative)
Theta(n log n) instead of Theta(n^2) worst case like (standard) Quicksort.
(Shellsort is Theta(n (log n)^2) with good increments, IIRC)
Re:Nice and all (Score:3, Informative)
It depends on what you're measuring.
While Quicksort is O(n log n) in typical use, its worst-case performance is O(n^2), although most serious implementations would use a variation such as Introsort to limit the worst-case behaviour.
Mergesort is O(n log n) in all cases. However, the constant factor is usually significantly greater than Quicksort's, making the latter preferable for most uses.
Shell sort is significantly slower for most situations, though IIRC it's pretty good with data that's already almost sorted.
Re:Nice and all (Score:4, Interesting)
There may be some cases where shellsort is more desirable for the exact data being sorted; I don't really know for sure. The importance of this is that he has used a GA to improve on the optimization work of humans on shellsort. He has laid out the groundwork, and circumstantial proof, for others to do the same with other algorithms. Of course, he evolved a set of constants more than an algorithm itself.
The next logical place to go with this work, IMHO:
1) Invent a concise fake machine language for sorting algorithms (a convenience to make the rest of this easier). It should have basic instructions used in manipulating and sorting arrays (move, compare, copy, branching, etc...).
2) Write a "sort engine" that takes as input algorithms written in your fake language and uses them to sort things (outputting some performance numbers).
3) Implement every known array sorting algorithm you can find in your little fake machine sort language.
4) Let a GA evolve the whole algorithm by arbitrarily replacing bytes with other valid bytes from your specialized assembler language, starting with all the known sort algs as parents. Let it run until it stabilizes, using a relatively high mutation rate.
Of course, the big problem is that if your language implements any kind of looping construct, or any other way that code can be re-executed (and it almost certainly will have to), then you face a "halting problem" when testing each new child. The practical workaround is to note that any reasonable algorithm must finish the sort within a bounded number of CPU cycles, and to terminate any children that take longer (a minimal sketch of such a step-budgeted fitness check follows the list).
5) Translate the winning candidates' custom machine code back into a generic language like C, and puzzle over exactly why it works so damn well.
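To illustrate step 4's workaround (my own minimal sketch, not from the article): candidate programs in a toy compare-swap instruction set are run with a hard step budget, and anything that exceeds it is scored as non-viable.

    import random

    # Toy instruction set for evolved sorters (hypothetical, for illustration):
    #   ("cswap", i, j)       compare a[i] and a[j]; swap if out of order
    #   ("jgt", i, j, addr)   jump to instruction addr if a[i] > a[j]
    MAX_STEPS = 1000

    def run(program, data):
        a = list(data)
        pc = steps = 0
        while pc < len(program) and steps < MAX_STEPS:
            op = program[pc]
            if op[0] == "cswap":
                _, i, j = op
                if a[i] > a[j]:
                    a[i], a[j] = a[j], a[i]
                pc += 1
            else:                            # "jgt"
                _, i, j, addr = op
                pc = addr if a[i] > a[j] else pc + 1
            steps += 1
        return a, steps < MAX_STEPS          # (result, halted within budget?)

    def fitness(program, trials=20, n=8):
        """Count correctly ordered adjacent pairs over random inputs;
        a candidate that blows the step budget gets the worst score."""
        score = 0
        for _ in range(trials):
            data = [random.randint(0, 99) for _ in range(n)]
            result, halted = run(program, data)
            if not halted:
                return float("-inf")
            score += sum(x <= y for x, y in zip(result, result[1:]))
        return score

    # Seed parent: a bubble-sort-like chain of compare-swaps for 8 elements.
    bubble = [("cswap", i, i + 1) for p in range(7) for i in range(7 - p)]
    print(fitness(bubble))   # full marks: 7 ordered pairs * 20 trials = 140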
Re:Nice and all (Score:3, Interesting)
Re:Nice and all (Score:3, Interesting)
Re:Nice and all (Score:3, Interesting)
There are some inherent disadvantages to GP, even though it is much more powerful than GA. One of them is the problem of code growth: usually, after some generations, the organisms get too large and start to consume too much memory and CPU to be feasible. One of the papers I wrote for an Evolutionary Computation class in college investigated a 'solution/improvement' to the code growth problem.
Re:Nice and all (Score:2, Informative)
Re:Nice and all (Score:4, Interesting)
The big problem with GA though, IIRC, is that the resulting solution is often incomprehensible to a human. I believe Bill Joy did some work with GAs and had comments along those lines (sorry, I couldn't find the quote). Consider for a moment though trying to troubleshoot code generated by a computer. Bad variable names would be just the start of your problems. The logic patterns employed would be essentially random to a human. Many of the patterns would be vestigial and wouldn't even be relevant, but you wouldn't even know that. Identifying the primary execution paths would be a huge chore... never mind actually understanding the basis for why the generated solution actually works.
How comfortable would you be deploying a solution (hardware or software) where the fundamental design isn't even understood? How the heck do you fix such a thing once it's deployed?
Re:Nice and all (Score:4, Funny)
I'm just happy my parents didn't have the same concerns when they "deployed" me!
Re:Nice and all (Score:1)
In this case it does not matter much. The method used to sort data is not likely to be visible in the design.
Any program that lives through multiple maintainers (esp. any interns) likely has code that no one understands. Layers of compatibility requirements mean you can never change the software's behaviour, but no one can figure out everything the software is doing.
Ever see modules with comments like:
if (someglobal->_pt->getTarget()->foo() == MAGIC_COOKIE_42 + 7)
Maybe with some inconsistently applied attempts at a naming convention thrown in to defeat any effort to figure out what the code is doing.
Re:Nice and all (Score:1)
This sort of (non)design is often found in real code written by humans. Only difference is that the 'generated solution' doesn't usually work too well. I have many years exp fixing such things once deployed :-(
Re:Nice and all (Score:1)
--I dunno, it seems to work for MS...
Re: Nice and all (Score:1)
> The big problem with GA though, IIRC, is that the resulting solution is often incomprehensible to a human.
Same with perl, but people use it anyway.
Re:Nice and all (Score:2)
- A lot of simplification of a program can be automated. Some of that simplification will take significant effort to implement, but it's doable. Other parts can be done by changing the fitness function once you have an efficient algorithm: in addition to performance or whatever other criteria you use, you add measurements of program size and complexity as secondary influences on the fitness.
- The robustness of the language the program is generated in can be strengthened. Typically GP uses custom domain-specific languages. This both simplifies understanding and prevents crashes.
- The experiments can be rerun with additions to the language used that group constructs that are complex but understood in order to attempt to simplify the solutions. The population can be steered towards using these constructs by rewarding them in the fitness function.
All in all, GP is very promising for certain types of problems, but it shouldn't be understood as "press this button and out comes a polished working program"; rather, it is a tool for approaching problems that are hard to analyse (especially where there is no 100% "correct" solution, or a correct solution is extremely hard to find, but gradual approximation has the potential of working well enough). GA and GP also have great potential wherever requirements change slowly over time and the software needs to adapt, as a way of avoiding constant manual tweaking.
Imagine a lift control GA, for instance (actually used as an example in Dr Dobbs some time ago), where a near-optimal solution could be used as a starting point for the lifts' behaviour. However, "traffic patterns" in a building change - new tenants move in, old ones move out, departments move and change which floors have the heaviest traffic and which floors the traffic runs between - and a GA could quite possibly be used to adapt to such changes automatically.
On a sidenote, before anyone raises concerns over letting a "random" program run a lift, the whole point here is that a GA isn't random. You use a hard-coded algorithm that enforces rules for what is acceptable and what is not. For a lift, of course, you'd have low-level requirements such as braking at a specific speed, and high-level requirements such as never changing direction before all the floors requested in the current direction have been visited.
The GA will only adapt the parameters you choose to surrender control of, such as whether or not the lifts should move when not requested (some of them moving to a specific floor for lunch time, for instance), and how long to wait before doing so.
Re:Nice and all (Score:1)
By contrast, shellsort is quite easy and clean to code, and this improvement to the algorithm just uses a better increment sequence, which doesn't complicate things much.
So, to answer your question, this has a lot of practical value, namely when development time is more valuable than running time.
Re:Nice and all (Score:5, Informative)
You're generalising, which is an easy mistake. Quicksort is good, but for many data sets it is very likely suboptimal and another algorithm would perform better. Shellsort is good for small data sets. In fact, a good quicksort implementation may use shellsort once the divisions get small enough.
Another example: quicksort is poor at sorting strings. This is because you have to perform a full strcmp() (or equivalent) at each comparison. For this reason, a modified radix sort is often used.
Re:Nice and all (Score:1)
Radix sort is also not very good at sorting strings. Radix sort requires O(nk), where k is the number of radix points. If one sorts a list of names from the phone book, the keys in this case can have over 30 radix points. Since 2^32 is over 4 billion, a quicksort can sort better than a radix sort as long as there are fewer than about a billion names in this case.
Also, strcmp() is not that slow overall. Since there are far, far more people not named John Smith than there are named John Smith, strcmp() will complete the vast majority of its calls in just a few operations.
Radix sort cannot be done in place, thus requiring double the memory to run (well, at least for the pointers, which for a large list can still be quite a cost). If this additional memory needs to be swapped in and out, the performance penalty can be quite nasty.
Radix sort is best used when the number of radix points is quite small, such as zip codes (only 5 radix points). In general this only happens when a large number of items have short keys. This prevents radix sort from being a good general-purpose sort.
Quicksort with a good pivot selection algorithm (to avoid the already sorted list problem) or Mergesort are still used quite often today.
In fact the Java library sort uses a modified mergesort, and the standard C library has qsort().
I did check Knuth about this, and radix sort can get around its problems on a beefy machine. In fact, radix sort is standard on supercomputers. So if you are programming a Cray, or whatever they call them nowadays, you can ignore everything I have said.
Re:Nice and all (Score:1)
It's the best. It's impossible to look at fewer characters than a radix sort does.
Algorithm:
1. Put the strings into piles according to their first letter.
2. Subdivide each pile into further piles according to the second letter.
3. Subdivide each of the new piles according to the third letter.
4. ...and so on for successive letters.
5. Collect the piles together.
You only ever look at each character once! The problem in implementation is keeping track of all the piles, and everything you've said by way of criticism (all perfectly valid) is really criticism of an implementation, not the theory.
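Here is a minimal Python sketch of the pile algorithm just described (an illustration of the idea, not the American Flag implementation from the paper mentioned below):

    def msd_radix_sort(strings, depth=0):
        """Recursive MSD ("pile") sort: bucket by the character at `depth`,
        recurse into each pile, then concatenate the piles in order.
        Each character of each key is examined at most once."""
        if len(strings) <= 1:
            return strings
        exhausted, piles = [], {}
        for s in strings:
            if len(s) == depth:
                exhausted.append(s)          # key used up: it sorts first
            else:
                piles.setdefault(s[depth], []).append(s)
        out = exhausted
        for ch in sorted(piles):             # visit piles in character order
            out.extend(msd_radix_sort(piles[ch], depth + 1))
        return out

    print(msd_radix_sort(["smith", "simth", "smythe", "smit", "a"]))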
If you're interested, this paper [bostic.com] suggests a good, efficient algorithm.
strcmp() itself isn't slow, but when you're checking the same characters over and over again, performance is shattered.
Naive radix sort implementations don't sort in place, but it's certainly possible. The American Flag algorithm described in the paper linked to above does just that. Memory usage is still problematic, though, but it's nowhere near double.
They are, and they have their place, but that's not my argument. My point was that it's dangerous to generalise quicksort as a superior sort for everything. eg. shellsort is superior for small data sets.
This applies to radix sort too of course: for very specific input with very specific properties, radix sorting of strings may be weaker than some other algorithm. In fact, the sorting stage algorithm suggested in the original Burrows-Wheeler Transform paper highlights this.
qsort() isn't necessarily a quicksort implementation (i.e. the ANSI standard doesn't require it to be), although it very often is.
Did you ask Knuth personally, or did you simply check the Book? That sounds really impertinent, I know, but the reason I ask is that most of the research on radix sort has been done since the Book was written.
Quicksort vs Timsort (Score:2)
Intro
-----
"This describes an adaptive, stable, natural mergesort, modestly called
timsort (hey, I earned it ). It has supernatural performance on many
kinds of partially ordered arrays (less than lg(N!) comparisons needed, and
as few as N-1), yet as fast as Python's previous highly tuned samplesort
hybrid on random arrays."
Re:Nice and all (Score:3, Insightful)
Shell sort has better worst-case running-time than quick sort does.
Most programming these days is all about the common case, but in real-time programming it's all about predictability. Thus mergesort, not quicksort, is the standard choice there.
Re:Nice and all (Score:2)
Quicksort is good for a lot of cases, but not for all. Yet one more good example (not mentioned here yet) is having knowledge of how unsorted the data really is. Meaning that if you know something extra about the "unsorted" state of the data, quicksort can be outperformed easily.
A classic example is arranging credit-card bank payments based on the actual purchase date. As each store sends its bill with some variation in delay, the bills arrive out of order in time. On average they're still almost sorted, but not quite. For such a case quicksort would perform quite poorly compared with what could be done.
Truly random data is encountered far more seldom than people think, so specialised sorting approaches can be applied more often than is commonly assumed.
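As an illustration (hypothetical data, not from the parent post): plain insertion sort runs in roughly O(n + d) time, where d is the number of inversions, so data that is already almost in order sorts in nearly linear time.

    import random

    def insertion_sort(a):
        """O(n + d), d = number of inversions: on nearly sorted input the
        inner loop hardly ever runs."""
        shifts = 0
        for i in range(1, len(a)):
            x, j = a[i], i - 1
            while j >= 0 and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
                shifts += 1
            a[j + 1] = x
        return shifts

    # Nearly sorted "bill arrival" data: in order except for small local jitter.
    data = list(range(20_000))
    for i in range(0, len(data) - 3, 3):
        if random.random() < 0.5:
            data[i], data[i + 2] = data[i + 2], data[i]

    print("shifts needed:", insertion_sort(data), "for", len(data), "items")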
Genetic Algoritm wistfulness? (Score:1)
Re:Genetic Algoritm wistfulness? (Score:2)
Trust me, you do not want to suddenly unleash an efficient algorithm for factorising large numbers into primes on the world. For a start, security as we know it would more-or-less disappear overnight, if only because so much important data has been encrypted using algorithms that assume such factorisation to be hard.
But that is, in fact what I want... (Score:1)
I very much doubt that (Score:2)
I truly and sincerely believe that you should think about that claim, and then think a whole lot harder about what you'd like to see replace it. There are plenty of places in the world, still, where the protections that we take for granted don't exist. I'm guessing you wouldn't like to be there a whole lot, and releasing all the stuff that's currently protected is a one-way trip.
Protections (Score:1)
Re:Protections (Score:2)
You've seen and heard a lot about terrorism lately, I assume. When the intelligence agencies screw up, and something like Sept 11 happens, you hear all about it. How many terrorist incidents do you think such agencies prevent for every one that happens? I'm betting it's quite a few.
I realise that a lot of intelligence officers just read the local papers and use their brains, but I'm guessing that on-the-ground in-with-the-bad-guys people are still pretty important as well. If you blow the cover of every single intelligence officer in the world, who do you think it's going to hurt more, us or the bad guys?
Next up... You have a bank account, right? How would you feel about not only letting anyone who wanted to see the details, but letting any script kiddie with half a brain play with the money itself? If you remove the encryption that's used when things like transfers are done electronically, what exactly is going to stop them?
Now, I don't know whether either of these specific things is a valid example. I kinda assume that military and serious finance types have more than just a couple of prime numbers to protect their data. Then again, screw ups do happen, so maybe it's not so much more. It doesn't really matter anyway. Even if these exact things wouldn't crumble if you released an efficient prime factorising algorithm, there are numerous other things along the same lines that are bound to. RSA is a well-known and widely used algorithm that would be cracked instantly, for a start.
Now, I don't know about you, but personally I rate my personal safety, the security of my finances, the confidentiality of my health records and such pretty highly. Right about now, they're reasonably protected as far as I know. If you release something that cracks such a popular and fundamental encryption technique overnight, that all disappears. Me, I reckon I'd prefer living in the West to living in a third world country where the only people who have any control are the guys with the biggest guns, but maybe I'm just dumb like that...
The entire game changes (Score:1)
Re:The entire game changes (Score:2)
Unfortunately, the sort of society you're describing couldn't come about overnight. Those sorts of changes would take many decades to get right even starting from an organised beginning. In the period that would follow the breakdown of encryption, I suspect what you'd actually see would be a period of panic and relative anarchy, a clampdown into police states to control it, and subsequently many years of corruption in governments desperate to retain control until they could reestablish the mechanisms that were there before using alternative means. The entire game doesn't change at all -- breaking one crypto algorithm isn't anything like enough to catalyse that -- it just gets really ****ty for a while.
well done! (Score:1)
can you break it down for me a bit more? (Score:1)
From the article (Score:4, Funny)
"Notice that the mating process has nothing to do with the constraints placed on valid organisms."
I wish more women would read Slashdot. Chalk one up for the "Size Doesn't Matter" team.
"I chose single-point crossover as the method of mating because it is simple & generic."
I choose doggie style for the same reasons.
Actually this HAS been done before (Score:4, Informative)
"Faster Shellsort Sequences: A Genetic Algorithm Application", Proceedings of the International Society for Computers and Their Applications(ISCA), April 7-9, 1999. (With Shashidhar Yachavaram) Presented at the conference, Cancun Mexico.
Re:Actually this HAS been done before (Score:2, Interesting)
The only GA-originated increments for Shellsort that I could find online were the ones in the Crew fall 2001 report [allegheny.edu], but that sequence didn't perform very well, & they didn't document their technique except to say that it wasn't a full-blown GA. (They placed some limitations on it. They also mention that Simpson & Yachavaram [allegheny.edu] used only a limited GA.)
I wondered if a GA that fully manipulated bit-strings could produce a sequence as good as Sedgewick's, & I also wanted to document the technique so someone else interested in GAs could implement their own or at least be a little entertained with how it was done.
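For readers unfamiliar with the terms, here is a generic Python sketch of single-point crossover and per-bit mutation on bit-string genomes; it illustrates the standard operators, not the article's actual code:

    import random

    def single_point_crossover(parent_a, parent_b):
        """Pick one cut point and swap the tails of two equal-length bit strings."""
        assert len(parent_a) == len(parent_b)
        cut = random.randrange(1, len(parent_a))
        return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

    def mutate(bits, rate=0.01):
        """Flip each bit independently with a small probability."""
        return "".join(("1" if b == "0" else "0") if random.random() < rate else b
                       for b in bits)

    a = "1010110010101100"
    b = "0001111100011111"
    child1, child2 = single_point_crossover(a, b)
    print(mutate(child1), mutate(child2))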
Other sorting attempts using GA's (Score:2, Informative)
Thus, optimizing sorting algorithms with GA is not new in itself.
Re:Other sorting attempts using GA's (Score:1)
Even though one of my first versions only used half of the genes due to a bug, it still managed to create a network that was just one step larger than the theoretical minimum.
After Ctrl-C'ing it I implemented a way to actually print and save the network, after which it never got better than three steps worse than the minimum. Damn.
For those who cannot grok Lisp shell sort... (Score:1, Informative)
Typo in "run 8" ? (Score:1)
Re:Typo in "run 8" ? (Score:1)
be 1325 (one number). I'll check and fix it
when I get home. Thanks for catching that.
Just like to add... (Score:2)
You may also be interested in Tierra [atr.co.jp]. Open-ended networked evolution - it's pretty nifty.
Got to love watching the computer solve problems for you. Ahh...
Limited number of test inputs (Score:1, Interesting)
Also, randomly ordered inputs are not common in practical applications. Did you test (or consider testing) partially sorted and partially reverse-sorted numbers? Presumably the contrived sequences (such as Sedgewick's) are balanced to provide good results for both random and non-random data.
Also see comb sort (Score:2, Interesting)
http://cs.clackamas.cc.or.us/molatore/cs260Spr0
http://world.std.com/~jdveale/combsort.htm
Fitting... (Score:3, Funny)
efficency vs proof - Do we want this? (Score:3, Insightful)
Is the efficiency gained worth the difficulty of proving that the new genetically enhanced algorithm works?
It's hard enough, even with rigorous testing, to prove that simple programs work without fail, without introducing random "genetic" changes that may break the algorithm for some input cases.
One of the problems with AI and self-modifying algorithms is that as it learns
Re:efficency vs proof - Do we want this? (Score:2)
In this case, the algorithm is guaranteed to work with a wide range of parameters governing its precise functioning. The genetic algorithm is only being used to search the parameter space for the most efficient parameters. Since the algorithm will work no matter which parameters you pick, this is a perfectly fine way to use genetic algorithms.
Also, genetic algorithms aren't self-modifying.
Shell Sort is proven, but good increments can't be (Score:3, Interesting)
Shell Sort works by roughly sorting, then sorting a little finer, and so on, until you finely sort every piece of data, and hopefully at this final and potentially most time-consuming step, there isn't a lot of work to be done.
The thing is, there's no platonic, God-like, perfect, mathematically proven best choice of how fine or rough the early iterations should be. No matter what increments you use (within certain constraints), Shell Sort will still work every single time (the last pass at increment 1 guarantees it) -- it's just a matter of performance. Nobody knows what increments work best, and you can't just mathematically figure it out.
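That property is easy to see in code. In the sketch below (mine, not the article's), the increment list is an ordinary parameter: any decreasing sequence ending in 1 gives a correct sort, and only the running time changes, which is exactly what makes the sequence searchable by a GA. The default gaps are Ciura's published sequence, just one reasonable choice:

    def shellsort(a, gaps=(701, 301, 132, 57, 23, 10, 4, 1)):
        """Shellsort with the increment sequence as a plain parameter.
        Correctness comes from the final gap of 1; the earlier gaps only
        determine how much work that last pass has left to do."""
        for gap in gaps:
            # Gapped insertion sort: every gap-th subsequence is insertion sorted.
            for i in range(gap, len(a)):
                x, j = a[i], i
                while j >= gap and a[j - gap] > x:
                    a[j] = a[j - gap]
                    j -= gap
                a[j] = x
        return a

    print(shellsort([9, 1, 8, 2, 7, 3, 6, 4, 5, 0]))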
Re:Shell Sort is proven, but good increments can't (Score:2)
My frame of mind has changed. Right now I'm working on very time-constrained real-time stuff, and when I have to do it in X seconds I like to know what the worst case is, not the best average case.
The genetic algorithm sounds like an iterative search to find the optimum increments for shell sort. Really nothing revolutionary.
I couldn't tell from the test description, but it doesn't seem that different arrays to sort were used to evolve the parameters of the sort (the increment list). If that is the case, the solution is just a sort optimized for a particular dataset and of little use.
I would be interested to see how well the algorithm would work evolved against different data sets (almost in order, reverse order, random).
And then compared with other standard increments.
They could hold a competition like the ones for good prisoner's dilemma strategies, but to come up with the best shell sort increments instead...