Timetabling Algorithms?
Phil John queries: "I'm developing a system for a University Student Union which employs 400+ student staff. Allocating shifts up till now has been a manual task keeping 1 member of staff busy for at least a day. I've been asked to implement a Web/SQL based system to get student availability (which changes each week), get shifts required and automatically allocate shifts. Now, here's the problem: how do I handle the timetabling bit? Most solutions require genetic algorithms and while I can understand and implement them (having a degree in AI and CS) I'm not going to be around after the summer and this creates problems for people maintaining my code. Cheers for any help you guys (and gals) can give me!"
This should be relatively trivial (Score:1)
Good luck, it sounds like you'll need it!
By hand? (Score:4, Insightful)
Try a greedy add or even loading heuristic. You will find that both are extremely easy to implement and maintain, and often do a "good enough" job for most manual scheduling problems.
Here is a heuristic I wrote for scheduling Navy Instructor Pilots: Get a list of all of the holes in the schedule. Find all candidates for all holes. Find which hole has the fewest candidates (greedy). Find out who has gone the longest without serving this duty (simple even loading). It works better than 97% of the time.
Genetic algorithms are a pain in the --- to write and maintain (and I will be teaching a class on them in the Fall at UCSB, so there's a proper endorsement). Talk to the person who currently writes the schedule and see how they do it. The logic in such an expert system is likely to do a decent job. It's not like you are scheduling for a manufacturing plant where a 2% improvement in scheduling can mean 2% improvement in revenues.
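The heuristic above (most-constrained hole first, then whoever has gone longest without duty) can be sketched roughly as follows. This is only an illustration, not the original Navy code: the names `fill_holes`, `available` and `last_served` are made up, and it assumes one person may cover more than one hole.

```python
def fill_holes(holes, available, last_served):
    """Greedy fill: most-constrained hole first, longest-idle candidate wins.

    holes: list of hole ids (hypothetical, e.g. day names)
    available: dict hole -> set of people who can cover it
    last_served: dict person -> int (higher means served more recently)
    """
    assignment = {}
    last = dict(last_served)
    clock = max(last.values(), default=0)
    remaining = set(holes)
    while remaining:
        # Greedy step: take the hole with the fewest candidates.
        hole = min(remaining, key=lambda h: (len(available[h]), h))
        remaining.remove(hole)
        candidates = available[hole]
        if not candidates:
            assignment[hole] = None  # nobody fits; leave it for a human
            continue
        # Even loading: pick whoever has gone longest without this duty.
        person = min(candidates, key=lambda p: (last.get(p, -1), p))
        assignment[hole] = person
        clock += 1
        last[person] = clock  # this person is now the most recently used
    return assignment


holes = ["Mon", "Tue", "Wed"]
available = {"Mon": {"ann", "bob"}, "Tue": {"bob"}, "Wed": {"ann", "bob", "cy"}}
last_served = {"ann": 3, "bob": 1, "cy": 2}
result = fill_holes(holes, available, last_served)
```

Holes that end up as `None` are the small fraction a person still has to sort out by hand.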
Re:By hand? (Score:2)
But otherwise, I have to say that just because something is NP-complete doesn't mean you can't get the optimal solution in a reasonable amount of time. Sure, your system wouldn't scale to 10,000 people, but that's probably not a concern. If you have a good heuristic and the data is reasonable, A* might do the trick. I'm using A* for path planning and it meets my requirement that it can't take longer than 2ms to find the optimal path. It helps that my maximum path length is pretty short and the paths are usually a straight shot matching the heuristic.
I realize that worst case would be unworkable in your case, especially if you brute-force it, but I'd be surprised if you ever see anything approach the worst case.
Re:if you really have those degrees.... (Score:1)
you must go to must have the same AI and CS degree as the first guy
Don't Start From Scratch... (Score:4, Insightful)
Have your database detect collisions between the current schedule and the new student availabilities. Then try to juggle only the students with a collision. You won't always be able to do it, so a backup layer would juggle some of the other students, preferably randomly chosen (i.e. the most-available students don't always get shafted), until it works. This reduces the processing load of an otherwise NP-complete problem, and actually encourages the more stable students with a more stable schedule.
One drawback: the initial schedule needs to be entered by hand, but only once.
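A rough sketch of the first layer of this idea: keep last week's assignments wherever they still fit and refill only the collisions. The function name `repair`, the data layout, and the one-shift-per-student simplification are all assumptions for illustration. The "backup layer" that juggles other students is left out.

```python
import random


def repair(schedule, availability, rng=None):
    """Keep last week's assignments where still valid; refill only collisions.

    schedule: dict shift -> student (last week's schedule)
    availability: dict student -> set of shifts they can work this week
    Returns (new_schedule, unfilled_shifts).
    """
    rng = rng or random.Random()
    new = dict(schedule)
    # A collision is a shift whose current holder can no longer work it.
    collisions = [s for s, who in schedule.items()
                  if s not in availability.get(who, set())]
    # Students whose shift survived are left alone (assumes one shift each).
    busy = {who for s, who in schedule.items() if s not in collisions}
    unfilled = []
    for shift in collisions:
        free = sorted(p for p, ok in availability.items()
                      if shift in ok and p not in busy)
        if free:
            pick = rng.choice(free)  # random, so nobody is always first in line
            new[shift] = pick
            busy.add(pick)
        else:
            new[shift] = None
            unfilled.append(shift)  # hand these to the backup layer or a human
    return new, unfilled


last_week = {"Mon AM": "ann", "Mon PM": "bob", "Tue AM": "cy"}
this_week = {"ann": {"Mon AM"}, "bob": {"Tue AM"}, "cy": {"Mon PM"}}
new_schedule, unfilled = repair(last_week, this_week)
```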
Why is this NP-complete? (Score:1)
I thought bipartite matching with integral weights (probably no weights in this case, actually) was polynomial time. Am I missing something that makes this NP-complete, or did you just make that up?
Re:Why is this NP-complete? (Score:1)
Incidentally, I was also assuming normal considerations, such as preferring one long tour over several short ones.
Re:Why is this NP-complete? (Score:1)
But even finding the optimal solution for a bipartite matching problem is polynomial time. This is just matching, right? (His description is sort of vague...)
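For the plain matching case (each worker covers at most one shift, no preferences), a polynomial-time solution can be sketched with the textbook augmenting-path method, often called Kuhn's algorithm. The names `max_matching` and `can_work` are made up here, and the real problem may well have constraints that take it outside pure matching.

```python
def max_matching(can_work):
    """Maximum bipartite matching via augmenting paths.

    can_work: dict shift -> list of workers able to cover it.
    Each worker covers at most one shift. Returns dict shift -> worker.
    """
    holder = {}  # worker -> shift currently assigned to that worker

    def try_assign(shift, seen):
        for w in can_work[shift]:
            if w in seen:
                continue
            seen.add(w)
            # Take a free worker, or displace the current holder's shift
            # if that shift can be moved to someone else.
            if w not in holder or try_assign(holder[w], seen):
                holder[w] = shift
                return True
        return False

    for shift in can_work:
        try_assign(shift, set())
    return {s: w for w, s in holder.items()}


can_work = {"s1": ["a", "b"], "s2": ["a"], "s3": ["b", "c"]}
matching = max_matching(can_work)
```

In this example a naive first-come assignment would give "a" to s1 and leave s2 empty. The augmenting step moves s1 over to "b" instead.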
You're a hammer, and everything is a nail (Score:4, Informative)
Stuff like this has been around forever. Try looking up keywords like "optimization," "linear programming," "constrained optimization," and "operations research."
There are tons of packages out there to help you out. Good luck.
Re:You're a hammer, and everything is a nail (Score:1)
Don't forget... (Score:2)
The coward's way out. Often works on real world problems.
Re:Don't forget... (Score:3, Insightful)
Schedule For Week of 6/22:
It's a good thing he'll be gone after the summer is over, because he won't be around when the users discover that he's missed some important rule. Then, the future users will either have to patch in some extra rules themselves, or abandon the system. As Nelson says, "Ha Ha!"
Anyhow, the correct thing may be to not implement any rules. Keep the human around to actually fill in the blanks, and just use "Mr. Computer" to simplify coordination between the 400 students and the scheduler.
Figure out how the human scheduler does his work now, and automate the truly tedious parts of it (like, copying the schedule over from last week, going through hundreds of slips of paper with time-off requests, totalling up the coverage for all the time slots for all the days, and marking off requested scheduled times). Then, let the human do the heavy lifting, and deal with things like "Suzy doesn't like to work the Bill", and "George wants a few extra hours this week, if he can get them", or "Brenda will only work on Friday evening if no-one else will do it."
But automating the whole thing a few days before you leave town forever is just begging to create a nightmare application that doesn't quite do exactly what they need in a way that anyone there is trained to understand. The fact that you seem concerned about "algorithm maintenance" instead of "rule maintenance" makes me doubly sure that you probably shouldn't build this system.
Well, unless you just want to build the thing, for fun, and you really don't give a damn if they use it. Then, go ahead. Knock yourself out. But don't pretend like you give a poop about maintenance, 'cuz it ain't gonna be maintained.
Design Tip: SQL Based Scheduling Systems (Score:2, Informative)
SQL Based Scheduling Systems [skinewyork.com]
Note, I haven't used this system (yet) but I enjoy the elegance of the bit-field query.
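The linked article isn't reproduced here, so the following is only a guess at what a bit-field availability check looks like: one bit per hour of the day, and a shift fits when all of its bits are set in the person's availability. The helper names are hypothetical. In SQL the same test would presumably be a bitwise AND compared against the shift mask.

```python
def slots_to_mask(hours):
    """Pack a collection of hour numbers (0-23) into an integer bit field."""
    mask = 0
    for h in hours:
        mask |= 1 << h
    return mask


def can_cover(availability_mask, shift_mask):
    """True when every hour of the shift is set in the availability field."""
    return (availability_mask & shift_mask) == shift_mask


available = slots_to_mask(range(9, 17))    # free from 09:00 to 17:00
lunch_shift = slots_to_mask(range(10, 14)) # 10:00 to 14:00
late_shift = slots_to_mask(range(16, 20))  # 16:00 to 20:00

fits_lunch = can_cover(available, lunch_shift)
fits_late = can_cover(available, late_shift)
```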
Good luck,
-malakai
Sure (Score:5, Funny)
I have a degree in "AI". Really. I have no reason to just make that up.
Anyhow, it turns out that I am unable to implement standard AI algorithms. I have no idea where the standard AI algorithm repositories are on the net, and I am unaware of any of the standard textbooks on the subject. In fact, I am unable to do even the most basic library research on my own.
Should I sue the school that gave me this fucked up, worthless degree? Or are my shortcomings entirely my own fault?
Sincerely,
Phil (The Turnip Head) John
Re:Sure (Score:1)
For a definition of randomness, see /. moderation.
Re:Sure (Score:4, Funny)
You missed the subtle difference the moderators found all-important. His first, "redundant," post was signed:
"Sincerely, Bobby."
His second, "funny," post was signed:
"Sincerely,
Phil (The Turnip Head) John"
See the difference? The second is much funnier, making the first post clearly redundant.
Slashdot software moves comments around. (Score:1)
Funny, but that's probably not the explanation. Sometimes the Slashdot software moves comments around. Maybe the one that is first now was second before.
Re:Sure (Score:1)
Perhaps he got his diploma from one of those 'non-accredited universities' that I keep getting mails about.
Genetic?? (Score:4, Insightful)
open source it! (Score:3, Funny)
Well, since this is slashdot, why not open source it, then the "community" can maintain it?
Just kidding.
Constraint propagation and truth maintenance (Score:1, Insightful)
You surely recognize this as a constraint satisfaction problem then. Set up the variables, their domains and the set of constraints and use a standard package to solve it (or give a reasonably good approximation). This is your initial solution, and then you incrementally re-solve as the constraints change. How much do you expect people's availability to change? Are the constraints going to be drastically different?
Do you need to dump a nice explanation of why your code produces the resulting solution? Jesus christ. What kinda kids do people hire to do these jobs. And this isn't even that difficult.
Aha! (Score:4, Insightful)
IIRC, most GA papers use either elevator control or personnel scheduling as example problems, much like many OO texts use bookstores. Therefore, Mr. John has come to believe that personnel scheduling is best solved by GA.
At least, so I hypothesize. It seems like a fairly straightforward A* search problem to me, although the suggestion of working from previous schedules and just fixing what needs fixing as opposed to starting from scratch is a good suggestion.
Additionally, so what if it is NP? Frankly, if it took an employee a day to do, the machine should have at least 24 hours to work on the data to come to a solution, which is intuitively more than reasonable. Sure, initially fan out is large, but the more restrictions students give, the more fan out diminishes as you descend the solution tree. Honestly, you've got a pretty interesting heuristic to write, IMO.
Re:Aha! (Score:1)
Nice try with the A*. Next time look up what it means.
In AI, this problem gets called "Constraint Satisfaction." Typically, a logic-based programming language gets used, such as Prolog. While he can write the program in C, the nonintuitive way he would have to handle backtracking would seem difficult for other coders to understand. Prolog seems understandable to anyone who knows logic (i.e. any programmer). Writing backtracking programs in Prolog becomes very easy since Prolog has backtracking as a fundamental component. Let us look at how this program would look in Prolog.
We have 5 days to fill and 4 workers to do it. Ann can work on Thu. Betty can work Mon-Tue. Joe can work Mon and Fri. Katie can work Tue - Thu. That finishes the program! You cannot do that as easily and as simply in almost any other language. You can also continue to ask for alternatives for the schedule (the program lists 9 plans). He can extend this by adding times (obviously) and even preferences for times. The system should feel simple and seems easy to maintain.
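The Prolog listing itself is not in the thread. As a stand-in, here is a hedged Python sketch of the same generate-and-backtrack idea, using the availability facts exactly as stated above. `DAYS`, `CAN_WORK`, `plans` and the optional `max_days` limit are illustrative names, not part of the original program.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]
CAN_WORK = {
    "Ann": {"Thu"},
    "Betty": {"Mon", "Tue"},
    "Joe": {"Mon", "Fri"},
    "Katie": {"Tue", "Wed", "Thu"},
}


def plans(max_days=None, day_index=0, partial=()):
    """Yield every complete plan as a tuple of (day, worker) pairs."""
    if day_index == len(DAYS):
        yield partial
        return
    day = DAYS[day_index]
    for worker in sorted(CAN_WORK):
        if day not in CAN_WORK[worker]:
            continue
        worked = sum(1 for _, w in partial if w == worker)
        if max_days is not None and worked >= max_days:
            continue  # constraint violated: prune this branch
        # Choose this worker, then recurse. Returning to this loop and
        # trying the next worker is the backtracking step.
        yield from plans(max_days, day_index + 1, partial + ((day, worker),))


all_plans = list(plans())
capped_plans = list(plans(max_days=2))
```

Read literally, and with no limit on days per worker, these facts give 8 plans rather than 9, so the original Prolog program may have encoded them slightly differently. Capping everyone at two days leaves 6 plans.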
I don't mean to dethrone your CS knowledge, but A* doesn't even apply in this domain. Logic planning can take a great amount of time in other languages. He needs the right tools for the job.
---
Crulx
Re:Aha! (Score:1)
Unfortunately the Prolog program (as usual) is extremely inefficient. It'll work for a few people, but for hundreds, exhaustive backtracking search is really slow. Matching solves it (at least the problem you describe, which may be different from the "timetabling" problem) in polynomial time.
Re:Aha! (Score:1)
Yes, this does look like a path finding problem to me. Every unit of time (your example used days, but more likely we're talking about hour/position combination) needs to be filled. Sort them arbitrarily (honestly, for optimization, you probably want to sort them from fewest options to most.) The goal state is every block being filled. The path between nodes is assigning a worker to a time/position block.
So, now it's defined as a path searching problem. As far as a heuristic goes, I'm not sure I can provide one that's necessarily h*(), so you may be right: it might not be an A* problem, but I think some h() might be found for a best-first search, and that the choice of that heuristic could lead to a system that is fair to everyone. For instance, an open node that involves assigning work to someone over the ideal number of hours (either in a local preference kind of way, or in a global fairness kind of way) sorts lower than someone under the ideal hours. Similarly, workers with high availability sort lower than low availability workers (since they're more valuable in filling time blocks.)
Incidentally, how did you think Prolog solved this problem? Quite apart from responding to an algorithmic suggestion with a programming language, Prolog has to do something along these lines behind the scenes. And while I'll freely admit that my academic programming was done with Scheme, Lisp and Verilog (for variety), I wouldn't expect much in the performance arena from Prolog, mostly from the complaints of my colleagues. As a result, it might not be the best solution for 400+ student employees.
On the other hand, it sounds like it would be quick to test it out, and if it works, the job's nearly done. If it doesn't, it can be abandoned for C or C++ with some nice design.
Parting shot: whether it's a job for Prolog or a best-first search in C, it's still not a very good candidate for the more difficult to code and understand genetic algorithms, from which you can't even guarantee a result.
Do you know what NP means? (Score:2)
Do you know what NP-hard means?
It means there's no known polynomial-time algorithm, which means the best you can do is probably exponential time. It means 24 hours is barely any more helpful than 24 seconds. A factor of 3600 extra time with an NP-hard problem might allow you to schedule 40 staff instead of 30, but you'd probably need longer than the lifetime of the universe to reach 100, let alone 400.
People don't avoid NP-hard problems just to be picky. They avoid them because they simply don't work for large problem sizes in any reasonable amount of time.
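To put rough numbers on that, assume (purely for illustration) that the running time doubles with every extra staff member and that 30 staff take 24 seconds. The real growth rate depends on the algorithm and the constraints, so the figures below only show the shape of the argument.

```python
import math


def extra_items(speedup, base=2.0):
    """How many more items fit in the same budget if cost grows as base**n."""
    return math.log(speedup, base)


# 24 hours instead of 24 seconds is a factor of 3600.
gain = extra_items(3600)

# Time for 100 staff if 30 staff take 24 seconds and cost doubles per person.
seconds_for_100 = 24 * 2 ** (100 - 30)
years_for_100 = seconds_for_100 / (365.25 * 24 * 3600)

AGE_OF_UNIVERSE_YEARS = 1.38e10  # roughly 13.8 billion years
```

Under that assumption the extra day buys about a dozen more staff, and 100 staff would take on the order of 10^14 years.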
Re:Do you know what NP means? (Score:2)
Strictly speaking, yes, I do know. I also know that path search is, strictly speaking, NP-complete, and so no worse than exponential. I also tend to use the non-academic's (admittedly) lazy shorthand of NP to mean anything in the NP set of complexity. Certainly, P is a subset of NP, but if I knew the problem was P I would have said P, n'est-ce pas? Secondly, even though I say that the problem is NP, that doesn't mean that anyone can prove that it isn't P, yet.
Finally, without more work on the problem, neither of us can definitively say what the time complexity of this particular problem is - while it's almost certainly exponential, the exponent might be quite small, so that 400 staff might be well within the feasible solution range. However, theoretical and empirical limits should probably be put on the use of the system, and every optimization possible applied. I still laud the suggestion of reusing last week's data as a basis for today's.
Re:Do you know what NP means? (Score:2)
In the "Ask Slashdot how to do my job department" (Score:1)
Your S.U. manager should know how many employees he needs where and when he needs them. Simply break the days up into fixed shifts for each day (e.g. 2 people from 8am-10am, 4 people from 10am-2pm, 2 people from 2pm-4pm, etc.) Let the students "bid" on shifts (i.e. first come first served) or pick the shifts based on a lottery. For shifts that aren't popular, use the hammer of employment (i.e. do you like your 10am-2pm shift? Then you will work the 6pm-10pm shift.) Since your students' class schedules shouldn't change for at least 16 weeks, once you have 1 week of scheduling done, copy it to the next 15 weeks. If a student employee needs to change shifts, he needs to find his own replacement.
If you really got your pecker hard for a computer program try a freaking random number generator and drop students into shifts and then allow a human to correct what few openings are left. You can call it the "5th order polynomial with a 32 bit ROR" AI scheduling algorithm. (Hey, the Liberal Arts students will believe it.)
You aren't the first person that needs to schedule 400+ employees on a rotating schedule. This is how the real world does it buddy!!!
That AI degree isn't going to do anything but impress your ignorant friends at parties. Now go play in the street.
Re:is this the same phil john? (Score:1)
Greedy Algorithm? (Score:2)
"schedule the class E that has the earliest finish time then recurse on the classes that start after E ends"
Seems like this, applied and limited by the available personnel, and per position, could work pretty easily, and be easy to maintain.
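The quoted rule is the classic earliest-finish-time greedy for picking non-overlapping intervals. A minimal sketch for one person (or one position) follows, with shifts given as hypothetical `(start, end)` hour pairs. It would have to be repeated per person and per position as suggested above.

```python
def earliest_finish_first(intervals):
    """Pick a largest set of non-overlapping (start, end) intervals."""
    chosen = []
    current_end = float("-inf")
    # Sort by finish time, then take each interval that starts after
    # the previously chosen one has ended.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= current_end:
            chosen.append((start, end))
            current_end = end
    return chosen


shifts = [(9, 12), (10, 11), (11, 13), (12, 14), (13, 15)]
picked = earliest_finish_first(shifts)
```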
The military Solution (Score:1)
Draw a grid. List all the people down the left in alphabetical order. In the first cell of the grid number 1 all the way down. Highest person on the list with the lowest number gets the detail. Put an X in this box. Add one to everyone else's value and carry over to the next column. Repeat ad nauseam.
Glad I don't do that anymore.
Re:The military Solution (Score:3, Interesting)
The reference on how to do Army duty rosters properly (i.e. without screwing people or pissing them off) is to follow the directions in AR 220-45 [army.mil] and to fill out DA Form 6 [army.mil].
As another former S1 (and later G1), this was the only way to do it. I saw command investigations where people were reprimanded for not following the regs properly (or at all). In this case, the regulation does set out a fair way to rotate duty in a group.
I'm actually using this for scheduling Level 3 support in my development group and there've been no complaints...
GA for optimization, not solution (Score:2)
There are a number of papers on the application of GAs to the optimization of the timetabling problem, but all of them require an initial solution. This is a basic premise of GA - start with a working solution, modify it randomly, and test it, until you find something that works better.
The initial solution comes from solving a complex system of constraints. Typically a college will require the modelling of Students, Teachers, Classes and Venues. A group of Students enrolled for a particular Class, plus the Teacher teaching the Class, must not have clashes with other Classes, and the Class must not have a Venue clash.
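The "modify it randomly and test it" loop can be sketched as below. Note that this is only a random-mutation hill climber, the simplest relative of a GA. A real GA would add a population, crossover and selection. The names `improve` and `max_load` and the cost function are made up for the example.

```python
import random


def max_load(schedule):
    """Example cost: the number of shifts held by the busiest worker."""
    counts = {}
    for worker in schedule.values():
        counts[worker] = counts.get(worker, 0) + 1
    return max(counts.values())


def improve(schedule, can_work, cost, steps=1000, rng=None):
    """Mutate a valid schedule at random; keep a change only if cost drops."""
    rng = rng or random.Random()
    best = dict(schedule)
    best_cost = cost(best)
    shifts = sorted(best)
    for _ in range(steps):
        shift = rng.choice(shifts)
        candidate = dict(best)
        # Mutation: hand one shift to a random worker who is able to take it,
        # so the candidate stays a valid schedule.
        candidate[shift] = rng.choice(sorted(can_work[shift]))
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


can_work = {s: {"a", "b"} for s in ["s1", "s2", "s3", "s4"]}
start = {s: "a" for s in can_work}  # valid but lopsided initial solution
best, best_cost = improve(start, can_work, max_load, rng=random.Random(0))
```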
Venues are assigned to a class based on teacher locality and class size. Students are assigned to a class based on registration. Teachers are assigned to classes by a department.
A useful heuristic for getting the initial solution is to allocate class time in "streams": identify courses that cannot be taken simultaneously, and schedule them at the same time. e.g. CS 1, CS 2 and CS 3. If the same Teacher is required for different subjects in the stream, you have to take out all but one such subject.
You may also have to consider oversize classes - sometimes a class of (say) 900 students must be split into 2 venues; and sometimes one teacher must teach all 900, so instead of venues, there is a split into two classes. This is rather more complex than it sounds: some students will only be able to attend one of the classes, others could attend either. Those that can attend either need to be assigned to a particular class, to prevent the venue from being overfilled and "starving out" those who need to attend that class, because they have a clash with the other.
On the other hand, if it takes a single staff member "at least a day" to schedule all of this, then you are probably looking for a less complex solution. I was asked by a high school (1000 pupils, approx. 40 staff) to assist them by writing a timetabling program; they took up to a week between three planners to schedule their classes.
Re:GA for optimization, not solution (Score:2)
Just to follow up on my previous post, here are some resources:
No, this is not a problem for the faint of heart.
Annealing (Score:1)
How about this algo (Score:2, Insightful)
/. is the answer (Score:1)
(Useful?) Resources (Score:2, Informative)
GAs can be used to find sub-optimal or optimal (given enough time) solutions for NP-complete problems, Timetable (TT) being one of them.
However, I think the core benefits of using such a heuristic, in this case, might be:
1. At any point in time, you can tell the GA to stop, and you can get a feasible (but probably sub-optimal) solution from it,
2. GAs are, in some small ways, parallelizable.
Of course, you're probably not looking for a whiz-bang super-duper fast utility, so (2) probably isn't useful.
Perhaps you can find some use of these:
- A presentation [csubak.edu] I did on why TT is NP-Complete,
- I spoke about a GA approach [csubak.edu] for solving the TT.
Note that the time spent by a GA in finding an optimal solution is not guaranteed to be any better than the speed of an approximation algorithm, nor even the speed of the naive approach, when solving TT (or any other NP-complete problem).
The GA approach mentioned above had been used successfully in practice, though.
This is a rather non-trivial problem. (Score:1)
I ran into a similar, if somewhat simpler problem recently. 13 employees. 6 shifts per day. Various assorted constraints. In particular, there were frequent vacation constraints (on the order of 1-2 weeks per person per month).
Ignoring these constraints, there are 720 possible ways to assign 6 people to 6 different shifts. There are 1,716 possible ways to choose 6 people from a set of 13. Given 31 days in a month, that's 7*10^188 possible schedules forming the search space.
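The arithmetic is easy to check. The variable names are just for illustration, and Python 3.8+ is assumed for `math.comb`.

```python
import math

orderings = math.factorial(6)   # ways to spread 6 people over 6 distinct shifts
teams = math.comb(13, 6)        # ways to choose 6 people from 13
per_day = orderings * teams     # distinct staffings of a single day
per_month = per_day ** 31       # independent choice for each of 31 days

summary = f"{per_month:.1e}"
```

The result is a 189-digit number starting with 7, which agrees with the 7*10^188 figure.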
The problem domain is NP-complete. To examine every possible solution, we are looking at a computational time beyond the eventual heat-death of the universe.
If we don't need an optimal solution, but merely a solution, the problem becomes much more tractable. Until we consider the constraints.
The constraints may very well dictate that no valid solution exists. The catch is proving that no valid solution exists, for the non-trivial case, without examining every possible solution. I have not yet investigated how feasible this may be. It is certainly very domain, or rather constraint, specific.
GA's and other evolutionary programming techniques won't guarantee that you will ever find a valid solution, let alone the running time.
On the other hand, exploring the space from the most constrained timepoint outwards to the next most constrained timepoint, and so forth, backtracking as necessary, may have the unfortunate effect of exploring the entire search space. Properly coded, an interactive UI could permit the user to pause the work and "assist" by adding additional constraints, thereby limiting the search space.
On the other hand, provided you don't get into an over-constrained scenario, you can get awfully far with a simple greedy algorithm that attempts to load-balance.
Then there are the more mundane aspects: The UI itself. Data storage and retrieval. Adjusting the schedule to changing constraints (replanning). Printing and publishing the results. Providing the status of the search currently in progress.
They will occupy a fair bit of coding time as well.
I suggest you design the system. Figure out all the data entry, output, and UI bits. Don't implement. Just plan it all out. Then estimate how much time each part will take to write, code, and test. Especially test. Add up all the times, and see whether it's even possible to complete this software before you leave...
Greed is good... (Score:2)
This is a variant of the knapsack problem. It's a little more complicated, but not much. You have a collection of objects (employee hours) and a collection of containers (hours of time that need to be filled by some employee). You also have various restrictions (certain employees can't work at certain times, or equivalently certain employees can only work at certain times). Just start grabbing employee-hours at random, and shoving them in. If you reach an impasse (no employee can work a given hour) you backtrack to an earlier decision and pick differently.
There are degenerate situations where, although a solution exists, this algorithm will take an excessive amount of time to find the solution. But I'd bet you won't find them in this situation.
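A small sketch of "grab at random, backtrack on an impasse", assuming one employee per hour slot and a cap on hours per employee. The function name `fill` and the data layout are invented for the example. It is recursive, so a very long list of slots would need an iterative rewrite.

```python
import random


def fill(slots, can_work, max_hours, rng=None):
    """Randomized backtracking: one employee per slot, capped hours each.

    slots: list of slot ids
    can_work: dict employee -> set of slots they are able to work
    max_hours: dict employee -> maximum number of slots
    Returns dict slot -> employee, or None if no assignment exists.
    """
    rng = rng or random.Random()
    hours = {e: 0 for e in can_work}
    result = {}

    def place(i):
        if i == len(slots):
            return True
        slot = slots[i]
        options = [e for e in sorted(can_work)
                   if slot in can_work[e] and hours[e] < max_hours[e]]
        rng.shuffle(options)  # grab employee-hours at random
        for e in options:
            result[slot] = e
            hours[e] += 1
            if place(i + 1):
                return True
            hours[e] -= 1      # impasse further on: undo and pick differently
            del result[slot]
        return False

    return result if place(0) else None


slots = ["9", "10", "11"]
can_work = {"ann": {"9", "10"}, "bob": {"10", "11"}, "cy": {"9"}}
one_each = {"ann": 1, "bob": 1, "cy": 1}
filled = fill(slots, can_work, one_each)
```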
Brian
From The Stony Brook Algorithm Repository (Score:1)
Click here [sunysb.edu] for the algorithms.
Disclaimer: I haven't used them.