Can AI Be Trained to Grade CS Homework Assignments? (medium.com) 58
Long-time Slashdot reader theodp writes: Tech-backed Code.org reports that as part of efforts to provide scaled human-centered education, the Stanford AI Lab analyzed 711,274 solutions to interactive block-based Code.org programming assignments submitted by 3rd and 4th grade students to develop AI-based solutions for automatically grading student homework. The research project received funding from LinkedIn founder and VC Reid Hoffman, who is coincidentally a $1+ million supporter of Code.org, which provided the student data.
Autograding systems are increasingly being deployed at all levels of education to meet the challenge of teaching programming at scale. So, will AI make Computer Science grader and undergraduate teaching assistant jobs obsolete?
What is new about this? (Score:5, Informative)
Re: What is new about this? (Score:1)
Asking for a friend. (Score:2)
An AI friend wants to know
Re:What is new about this? (Score:4, Interesting)
Well, your class assignments were probably multiple choice. Let's say I asked you to write a sort routine and was then going to grade you on the quality of your work. That's something you can train an AI to do.
Whether you can train it to do a *good* job is a different question, and whether it would be worthwhile is yet a different question again. Most AI amounts to cheap, massively scalable mediocrity.
Re: (Score:2)
Most AI amounts to cheap, massively scalable mediocrity.
Indeed. Or worse. Unfortunately, there are a lot of mediocre (or worse) people out there that do not understand that.
Re: What is new about this? (Score:3, Insightful)
Sometimes it is a brilliant guy, but he requires a lot of "maintenance" since he has trouble understanding humans and doing business.
In my experience, the one
Re: (Score:2)
Well, most people *are* mediocre at what they do. There's nothing wrong with realizing it, the problem is believing you are immune to it.
Re: (Score:3)
No. Most people are average at what they do. Because most people are average.
Re: (Score:2)
Well, usually. There are some "double-hump" distributions out there. Coding-skill often seems to have that.
Re: (Score:2)
You misunderstand what is being talked about.
Re: (Score:2)
Let's say I asked you to write a sort routine
A big component of the grade should be whether it properly sorts the input. You don't need AI for that.
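For what it's worth, a correctness check like that fits in a few lines. This is only a sketch: `grade_sort`, `CASES`, and the two sample submissions are made-up names for illustration, and it assumes the student's routine takes a list and returns a new sorted list.

```python
def grade_sort(student_sort, cases):
    """Return the fraction of test cases the submitted sort gets right."""
    passed = 0
    for case in cases:
        try:
            # Hand over a copy so a submission cannot mutate the test data.
            result = student_sort(list(case))
            ok = list(result) == sorted(case)
        except Exception:
            ok = False  # a crash (or a non-list return) fails the case
        passed += ok
    return passed / len(cases)

# Hypothetical test cases: empty, single item, duplicates, negatives, reversed.
CASES = [[], [1], [3, 1, 2], [2, 2, 1], [-5, 10, 0], [9, 8, 7, 6]]

def bubble_sort(xs):
    """A correct sample submission."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def broken_sort(xs):
    """A wrong sample submission: returns the input unchanged."""
    return list(xs)

print(grade_sort(bubble_sort, CASES))
print(grade_sort(broken_sort, CASES))
```

Note that the broken submission still earns partial credit on inputs that happen to be sorted already, which is one reason the choice of cases matters more than the checker.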
Re:What is new about this? (Score:5, Interesting)
Let's say I asked you to write a sort routine
A big component of the grade should be whether it properly sorts the input. You don't need AI for that.
Indeed. (way back in the mid 80s) We had an assignment in the Programming 101 (C / Unix) class to read numbers in one layout, sort them, and write them in a different layout. I wrote the input/output routines in C and this shell script:
readin < ifile | sort | writeout > ofile
Got full credit -- with a note about thinking outside the box.
That professor hired me after I graduated to be a systems programmer / systems admin with Unisys at NASA Langley for their new Cray-2, Convex and other Unix systems.
Re: (Score:2)
If you're sorting numbers, shouldn't that be sort -n ?
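The difference is easy to see in miniature. A rough Python analogy, not the original shell pipeline: sorting the lines as text compares them character by character, while sorting with a numeric key compares their values, which is what `sort -n` is for.

```python
lines = ["10", "9", "2", "33", "1"]

# Text comparison, like plain `sort` on digit strings: "10" lands before "2".
as_text = sorted(lines)

# Numeric comparison, like `sort -n`: values are compared as integers.
as_numbers = sorted(lines, key=int)

print(as_text)     # ['1', '10', '2', '33', '9']
print(as_numbers)  # ['1', '2', '9', '10', '33']
```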
Re: (Score:3)
You're correct, but cut me some slack on my memory, it *was* about 35 years ago... :-)
At the basic level (Score:5, Insightful)
Plain old rule-based Lint plus automated tests, sure.
Don't use neural nets, because they are awful at explaining why they do stuff. Not useful for teaching.
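To illustrate the point about explanations, here is a minimal sketch of a rule-based check built on Python's standard `ast` module. The two rules and their wording are invented for this example, not taken from any real linter; the point is that every finding carries its own reason.

```python
import ast

def lint(source):
    """Return sorted (line, message) pairs; each message says what and why."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Global):
            findings.append((
                node.lineno,
                "uses 'global': pass values as parameters so the function "
                "can be read and tested on its own",
            ))
        elif isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None:
            findings.append((
                node.lineno,
                f"function '{node.name}' has no docstring: say what it does "
                "for the next reader",
            ))
    return sorted(findings)

# Hypothetical student submission.
SAMPLE = '''total = 0

def add(x):
    global total
    total += x
'''

for line, message in lint(SAMPLE):
    print(f"line {line}: {message}")
```

Each rule is a plain `if`, so a student (or a TA) can look up exactly why a line was flagged.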
Re: (Score:3)
More to the point, code is written for two reasons: First, it's written to perform a specific task. Second, it's written to document that process to other humans.
It's beyond trivial to ensure the first requirement is satisfied. You plug in the code and test it. Does it produce the correct answer or not?
But the second job of the code is harder to quantify. Does it read clearly, explain the process logically? Are variables and functions named well? Are there clear comments describing the code? Are thin
Re: (Score:2)
It's beyond trivial to ensure the first requirement is satisfied. You plug in the code and test it. Does it produce the correct answer or not?
It's not so trivial to ensure it produces the correct answer given all possible valid inputs, not to mention gracefully handles all invalid inputs.
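One partial answer is to throw random inputs at the submission and compare against a trusted reference. A sketch under stated assumptions: `find_counterexample` and `dedup_sort` are hypothetical names, Python's built-in `sorted` stands in as the reference, and the submission is assumed to return a list. It samples valid inputs only, so it cannot prove correctness and says nothing about invalid input handling.

```python
import random

def find_counterexample(student_sort, trials=200, seed=0):
    """Return (input, problem) for the first failing random input, else None."""
    rng = random.Random(seed)  # fixed seed so every student sees the same inputs
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 12))]
        try:
            result = student_sort(list(data))
        except Exception as exc:
            return data, f"raised {type(exc).__name__}"
        if result != sorted(data):
            return data, f"returned {result!r}"
    return None

def dedup_sort(xs):
    """A subtly wrong submission: silently drops duplicate values."""
    return sorted(set(xs))

print(find_counterexample(sorted))      # None: the reference agrees with itself
print(find_counterexample(dedup_sort))  # a failing input that contains a duplicate
```

A hand-picked test list without duplicates would pass `dedup_sort`; random inputs make that kind of gap much harder to slip through.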
Education Research already tells us the answer! (Score:5, Insightful)
As an education researcher (more specifically, a Learning Scientist) who does research on assessment, I already have high confidence in what will be the potential value/impact. [drumroll, please]:
For some homework assignments, that have limited ability to help students, this could work. But for many students, this won't be valuable.
To be more specific, this could be a valuable effort as long as the feedback from the homework is what students need. Will the automated grading tell the learner why they got the answer wrong? Or will it just point out that they made a mistake? In relation to your own learning, think about how often you learn when someone tells you that you got something wrong. Did that help? Or even further, think of the times you got something wrong and then someone showed you how to do it the 'right' way. Did that help? I bet the answer is that it did help sometimes, and then other times it wasn't really valuable since you needed to develop a better understanding of what you weren't understanding.
The funny thing is that people (even educators) often forget the value of assessments, including homework. They think of assessments only as summative, letting the learner (i.e., student) and instructor (i.e., teacher) know whether someone knows something or not. But, at most, that's about 50% of the value of assessment. The other factor is formative: whether the assessment (including homework) helps the learner understand what, if anything, is preventing them from understanding (mastering the skill, using the knowledge, etc.).
Re: (Score:2)
I absolutely concur. And specifically when it comes to programming, code has high variability: good programmers will create hooks for building onto it later, and often there are multiple ways to do things. It is not uncommon for me to put hooks and bits of stuff in my code intended for use later. Or subsections for debugging or testing. Maybe a tracker to figure out how long certain sections are taking to execute so I can determine if optimization is needed.
If a professor looks at my code with "useless" var
Re: Education Research already tells us the answer (Score:2)
Re: (Score:2)
That's a fair question (want to be an education researcher?).
The key challenge of determining the 'correctness' of an assessment is to figure out whether the learner is in one of four states:
A. They got the problem right and they understand what the problem is measuring.
B. They got the problem wrong and they don't understand what the problem is measuring.
C. They got the problem wrong and they understand what the problem is measuring.
D. They got the problem right and they don't understand what the problem is measurin
Hooks and comments (Score:2)
I know that Agile isn't the Holy Grail, the Silver Bullet nor the Ark of the Covenant. That said, one of the Agile aphorisms is "if you think you are going to need it, don't include it"
You are also a conscientious developer who puts a lot of important shit in your code comments. That said, if some clever implementation needs 'splainin', maybe it needs to be coded in a way that excuses do not need to be offered as to why it is being done that way?
OK, OK, profiling. Maybe there are profiling tools, ju
Re: Hooks and comments (Score:2)
one of the Agile aphorisms is "if you think you are going to need it, don't include it"
There's something to be said for keeping things lean, but there's also a need to anticipate future functionality. I've seen far too much "Agile" code that had to be trashed entirely because lack of foresight ended up with code that "painted itself into a corner." I've seen too many programs that had to be essentially discarded because the general approach to solving a problem was too limited, inflexible, etc., and it ended up being better to just start fresh.
Re: (Score:2)
Well, yes. Some actual teaching experience and a working mind is enough to see that. Sadly, these decisions are usually made by people that lack at least one of these and often both.
COBOL programmer reeducation camp (Score:3)
One of my formative coding experiences was a summer internship doing COBOL programming.
A second formative experience was a course, I guess it was called Systematic Programming using Pascal.
The professor lectured on how you could tell which language someone learned before Pascal by how they program in Pascal. Sure shootin', I got an assignment marked down for relying on global variables, which of course, was a "tell" for my prior COBOL experience. The TA doing the grading gave me a scolding in the gra
Re: (Score:2)
OK, Java at least forces scoping all variables to a class rather than allowing true globals. But if one employs the God Object Pattern, the variables of that class are globals for all intents and purposes.
Let he who has never coded a God Object cast the first null pointer to raise an exception!
The reason for a God Object is that this is the way to obtain the procedural programming paradigm when you just want to run numerical simulation experiments and after teaching your classroom sections and having
Re: (Score:2)
Has anyone studied the benefits (or issues) with having students learn first, then teach as final proof of knowledge? I think becoming the teacher, even only part time, would change how students interact with and think about their teachers. Also people can only focus on so many things (1 teacher for 40 or 200 students?)...so having lots of people give a little feedback about fewer people should increase the quality of those comments.
Or creating material they'd wish they'd had to learn a concept, giving th
Perhaps, if innovation was never to occur. (Score:3)
Correct recognition of a student's innovative, but wrong solution is imperative if we are to harness emerging talents, and not miss potential that needs a human mentor's guidance to be realised.
Stupid question from/by/for stupid people (Score:1)
"AI" is just about the most uninteresting thing you could throw at grading assignments.
It's no coincidence that teaching assistants are typically students themselves. Teaching others teaches you, too.
Now first you're clamouring for "more coders" and when an opportunity arises to further their education, you jump on the idea that you can write an AI to do it all instead?
You need your head examined. Or just admit that the "we need moar cod4rz" is a ruse, build an AI to do the job, and fire everyone your AI
It could work. (Score:2)
If you do not mind crappy grading, yes (Score:2)
The first problem with this is that there will not be enough meaningful feedback. But the worse problem is that as soon as you change the assignments, you will have to build up a new pool of manually graded assignments. The manual grading will either be worse, because those doing it will eventually lack experience, or it will not be done at all.
And so, on the eternal quest of making things cheaper, quality will suffer and that is not good at all.
AI can't ask why... (Score:3)
Re: (Score:2)
A good teacher can also later show the class various examples of how the problem was solved. This would be valuable feedback beyond a simple fairly useless grade.
Useless grades (Score:2)
Yes, grades are useless. Certainly there is evidence that SAT and GRE scores are useless at predicting success.
So then as a selection tool for entry into elite schools and being hired into prestige (i.e. high paying) jobs, we won't use them. We won't evaluate anyone's performance because the Pointy Haired boss doesn't know anything.
I guess we will do away with the whole concept of Meritocracy -- everyone hates it anyway. I guess we will fall back on nepotism and networks of personal connections? This i
The class TA (Score:2)
I commented above that the TA for the class grading the assignment does not ask the student this question either.
Yes, it’s trivial. (Score:2)
You have not been in college for many years (Score:2)
The stairs method is no longer usable at the U.
Students compare each other's returned papers. If their study partner got full points and they got marked down for the work they copied, they will be in your office fast enough to make your head spin. Don't think of accusing a student of copying when they have caught you not applying the same deduction. "Oh, get your friend to come in here so that I can take the same deduction!" Yeah, right.
I counsel TA's and graders, whatever credit or partial c
Of course AI can grade CS homework.. (Score:3)
Homework CS assignment: (Score:2)
Now rate this!
"coincidentally" (Score:3)
"The research project received funding from LinkedIn founder and VC Reid Hoffman, who is coincidentally a $1+ million supporter of Code.org, which provided the student data".
That's... not what "coincidentally" means. The two facts are quite related.
For your homework (Score:1)
Um, aren't those called unit tests? (Score:2)
It's not complicated.
Grade This (Score:2)
When I was in college, I was given the assignment of writing a LISP interpreter in SNOBOL. I decided to go them one better and turned it in with an English grammar parser written in LISP to run on it. [How to make a TA cringe...]