Last week, the New York Times alerted readers to the possibility of computers grading college-level student essays. As with any news of 'artificial intelligence' replacing humans, reactions to this announcement display the usual skewed mix of techno-boosterism, assertions of human uniqueness, and fears of deskilling and job loss.
First, a sample of the boosterism:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.
“There is a huge value in learning with instant feedback,” Dr. Agarwal said. “Students are telling us they learn much better with instant feedback.”
Then, the assertions of human uniqueness:
From the Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment:
Let’s face the realities of automatic essay scoring. Computers cannot ‘read.’ They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others.
From Professor Alison Frank Johnson at Harvard University (Letters Section):
Only a human grader can recognize irony, can appreciate a beautiful turn of phrase, can smile with unanticipated pleasure at the poetic when only the accurate was required.
From Professor Cathy Bernard of the New York Institute of Technology (Letters again), resisting the submit, get feedback, revise, resubmit model (which, to my mind, sounds a great deal like a standard iterative writing process):
Writing is thinking, and revision is a slow process, unpredictable and exploratory. A piece of writing, like a cake taken from the oven, needs some time to cool before the revision process can even begin.
Finally, on Twitter, there was ample commentary to the effect that the 'other tasks' in 'freeing professors for other tasks' would merely be the revision of resumes as professors looked for work outside the academy, their jobs taken over by the latest technological marvel promoted by the Borg-like combination of corporate rent-seekers eager to enter the academic market and university administrators eager to let them in. Fair enough.
Still, I wonder if matters are quite so straightforward.
First, the assertions of human uniqueness in the domain of reading and writing leave me cold. They do not take into account the possibility that greater exposure to human examples of writing, especially essays graded by human professors (that is, better learning data), will significantly improve the quality of these automated graders, which rely on statistical machine learning techniques. (I'm not sure what 'realities' the PAMSSEHSA want us to face; computers can read, as you can verify the next time you enter your name in an online form. As for the lacunae they point out, I fail to see an argument that these are not achievable in principle.)
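To make the point concrete: the graders at issue are, at bottom, statistical models fit to human-graded essays, so more graded essays mean better predictions. Here is a minimal sketch of how such a grader might be trained. I am assuming scikit-learn, and the essays, scores, and features below are hypothetical placeholders; real systems use far richer features, but the statistical shape is the same:

```python
# Hedged sketch: a statistical essay grader trained on human-graded
# examples. Essay texts and scores are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training data: essays already graded by human professors.
essays = [
    "The argument rests on a false dichotomy between...",
    "In conclusion, the evidence clearly supports...",
    "This essay will discuss many things about the topic...",
]
human_scores = [4.0, 3.5, 2.0]  # say, on a 0-5 rubric

# Bag-of-words features feeding a regularized linear regression:
# crude, but it shows where the 'learning data' enters the picture.
grader = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge())
grader.fit(essays, human_scores)

# Instant feedback: a new submission is scored immediately.
print(grader.predict(["A new student essay to be scored..."]))
```

Whatever one thinks of such a model's judgments, note that every additional professor-graded essay added to the training set refines it; that is the sense in which 'better learning data' bears directly on quality.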
Second, I note that both Professors Johnson and Bernard write from private institutions where, I presume, professors often have teaching assistants to help them with grading. From my vantage point at a severely underfunded public university, I admit to wanting help with grading student assignments. I could assign more work, especially in upper-tier core classes, if I could rely on some assistance with grading. In general, as with the discourse surrounding online education, the same fallacy is committed here: comparing the worst of A with the best of B. The original New York Times article features the obligatory story of 'stupid computers' fooled by human trickery, and the letter writers wax lyrical about the magic of professors grading student writing. But what about tired, overworked professors who can only provide cursory feedback? Would their students do better with computer grading? Would every human professor always do better than an automated grader?
Thanks for the interesting – and balanced – post!
On my most recent post I ranted about the weird 'algorithms' HR uses to select candidates. But you are right: software and computers do not seem to be the core problem. Semi-automated keyword matching done by clueless HR staff, or rushed grading by overworked professors, is just as disturbing already.
I guess the core issue really is that there is not enough time, or there are not enough 'human resources', to do these tasks (or not enough money to pay reasonably qualified people), and we try to solve this by throwing technology at it.
They can’t possibly be serious, can they? Surely automated grading will result in nothing more than automated students and thinkers.
Surely humans and computers will be able to co-exist in the professional world without the fear of a Terminator-like crisis?
It is clear that software like this will help teachers with a growing problem: dealing with the assignments of hundreds of students, especially as college enrollment rises steadily. Indeed, this gives teachers more time on their hands…not to start fixing up their resumes, but, hopefully, to increase their mastery of their disciplines. Surely this would make for a brighter and perhaps much more competitive and innovative gang of teachers, professors, and doctorates?
But alas, I wonder if Brooklyn College could taste the potential rewards of such a venture.