Essay-Grading Software Seen as Time-Saving Tool
Teachers are turning to essay-grading software to critique student writing, but critics point to serious flaws in the technology
Jeff Pence knows the best way for his 7th grade English students to improve their writing is to do more of it. But with 140 students, it would take him at least a couple of weeks to grade a batch of their essays.
So the Canton, Ga., middle school teacher uses an online, automated essay-scoring program that allows students to get feedback on their writing before handing in their work.
“It doesn’t tell them what to do, but it points out where issues may exist,” said Mr. Pence, who says the Pearson WriteToLearn program engages the students almost like a game.
With the technology, he has been able to assign an essay a week and individualize instruction efficiently. “I feel it’s pretty accurate,” Mr. Pence said. “Is it perfect? No. But when I reach that 67th essay, I’m not real accurate, either. As a team, we are pretty good.”
With the push for students to become better writers and meet the new Common Core State Standards, teachers are eager for new tools to help out. Pearson, which is based in London and New York, is one of several companies upgrading its technology in this space, also known as artificial intelligence, AI, or machine-reading. New assessments to test deeper learning and move beyond multiple-choice answers are also fueling the demand for software to help automate the scoring of open-ended questions.
Critics contend the software doesn’t do much more than count words and therefore can’t replace human readers, so researchers are working hard to improve the software algorithms and counter the naysayers.
While the technology has been developed primarily by companies in proprietary settings, there has been a new focus on improving it through open-source platforms. New players in the market, such as the startup venture LightSide and edX, the nonprofit enterprise started by Harvard University and the Massachusetts Institute of Technology, are openly sharing their research. Last year, the William and Flora Hewlett Foundation sponsored an open-source competition to spur innovation in automated writing assessments that attracted commercial vendors and teams of scientists from around the world. (The Hewlett Foundation supports coverage of “deeper learning” issues in Education Week.)
“We are seeing a lot of collaboration among competitors and individuals,” said Michelle Barrett, the director of research systems and analysis for CTB/McGraw-Hill, which produces the Writing Roadmap for use in grades 3-12. “This unprecedented collaboration is encouraging a lot of discussion and transparency.”
Mark D. Shermis, an education professor at the University of Akron, in Ohio, who supervised the Hewlett contest, said the meeting of top public and commercial researchers, along with input from a variety of fields, could help boost performance of the technology. The recommendation from the Hewlett trials is that the automated software be used as a “second reader” to monitor the human readers’ performance or provide additional information about writing, Mr. Shermis said.
“The technology can’t do everything, and nobody is claiming it can,” he said. “But it is a technology that has a promising future.”
The first automated essay-scoring systems go back to the early 1970s, but there wasn’t much progress made until the 1990s with the advent of the Internet and the ability to store data on hard-disk drives, Mr. Shermis said. More recently, improvements have been made in the technology’s ability to evaluate language, grammar, mechanics, and style; detect plagiarism; and provide quantitative and qualitative feedback.
The computer programs assign grades to writing samples, sometimes on a scale of 1 to 6, in a variety of areas, from word choice to organization. The products give feedback to help students improve their writing. Others can grade short answers for content. To save time and money, the technology can be used in various ways on formative exercises or summative tests.
The Educational Testing Service first used its e-rater automated-scoring engine for a high-stakes exam in 1999 for the Graduate Management Admission Test, or GMAT, according to David Williamson, a senior research director for assessment innovation for the Princeton, N.J.-based company. It also uses the technology in its Criterion Online Writing Evaluation Service for grades 4-12.
Over the years, the capabilities changed substantially, evolving from simple rule-based coding to more sophisticated software systems. And statistical techniques from computational linguistics, natural language processing, and machine learning have helped develop better ways of identifying certain patterns in writing.
But challenges remain in coming up with a universal definition of good writing, and in training a computer to understand nuances such as “voice.”
In time, with larger sets of data, more experts can identify nuanced aspects of writing and improve the technology, said Mr. Williamson, who is encouraged by the new era of openness about the research.
“It’s a hot topic,” he said. “There are a lot of researchers and academia and industry looking into this, and that’s a good thing.”
High-Stakes Testing
In addition to using the technology to improve writing in the classroom, West Virginia employs automated software for its statewide annual reading language arts assessments for grades 3-11. The state has worked with CTB/McGraw-Hill to customize its product and train the engine, using thousands of papers it has collected, to score the students’ writing based on a specific prompt.
“We are confident the scoring is very accurate,” said Sandra Foster, the lead coordinator of assessment and accountability in the West Virginia education office, who acknowledged facing skepticism from teachers. But many were won over, she said, after a comparability study showed that a trained teacher paired with the scoring engine performed better than two trained teachers. Training involved a few hours in how to assess the writing rubric. Plus, writing scores have gone up since implementing the technology.
Automated essay scoring is also used on the ACT Compass exams for community college placement, the new Pearson General Educational Development tests for a high school equivalency diploma, and other summative tests. But it has not yet been embraced by the College Board for the SAT or by the rival ACT college-entrance exams.
The two consortia delivering the new assessments under the Common Core State Standards are reviewing machine-grading but have not committed to it.
Jeffrey Nellhaus, the director of policy, research, and design for the Partnership for Assessment of Readiness for College and Careers, or PARCC, wants to know if the technology will be a good fit with its assessment, and the consortium will be conducting a study based on writing from its first field test to see how the scoring engine performs.
Likewise, Tony Alpert, the chief operating officer for the Smarter Balanced Assessment Consortium, said his consortium will evaluate the technology carefully.
With his new company LightSide, in Pittsburgh, owner Elijah Mayfield said his data-driven approach to automated writing assessment sets itself apart from other products on the market.
“What we are trying to do is build a system that instead of correcting errors, finds the strongest and weakest sections of the writing and where to improve,” he said. “It is acting more as a revisionist than a textbook.”
The new software, which will be available on an open-source platform, is being piloted this spring in districts in Pennsylvania and New York.
In higher education, edX has just introduced automated software to grade open-response questions for use by teachers and professors through its free online courses. “One of the challenges in the past was that the code and algorithms were not public. They were seen as black magic,” said company President Anant Agarwal, noting the technology is in an experimental stage. “With edX, we put the code into open source where you can see how it is done to help us improve it.”
Still, critics of essay-grading software, such as Les Perelman, want academic researchers to have broader access to vendors’ products to evaluate their merit. Now retired, the former director of the MIT Writing Across the Curriculum program has studied some of the devices and was able to get a high score from one with an essay of gibberish.
“My main concern is that it doesn’t work,” he said. While the technology has some limited use in grading short answers for content, it relies too much on counting words, and reading an essay requires a deeper level of analysis best done by a human, contended Mr. Perelman.