Sketch Recognition

Computer recognition of hand-drawn diagrams

Walk into an Engineering 111 class at Texas A&M University, where students are taking an exam, and you might be surprised to see them answering the questions by drawing engineering diagrams on computers — and getting instant feedback from their computers.

Sketch recognition is the automated recognition of hand-drawn diagrams by a computer, which can be applied to classroom activities such as writing and drawing.

Dr. Tracy Anne Hammond, an associate professor in the Department of Computer Science and Engineering at Texas A&M and director of the Sketch Recognition Lab (SRL), teaches sketch recognition and computer-human interaction.

Hammond compared sketch recognition with handwriting recognition.

"Sketch recognition is more complex than handwriting recognition," she says, "in that the sketcher is not constrained by space and time, and there is no existing dictionary for the diagrams you draw. You have to figure out the dictionary. We are talking about mechanical and electrical engineering, and even drawing human faces. Where is the dictionary of things that people draw and has meanings?"

Just as the 26 letters of the alphabet form the building blocks of words and sentences in handwriting recognition, low-level shapes (such as lines, curves, arcs, helixes or spirals) often form the building blocks of diagrams in sketch recognition.
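To make the idea of low-level shapes concrete, here is a minimal sketch of how a single pen stroke might be labeled as a line or a curve. It is not the SRL's method. It assumes a stroke arrives as a list of (x, y) points and uses one simple heuristic: the ratio of the straight-line distance between the endpoints to the total length of the drawn path. The function names and the 0.95 threshold are illustrative assumptions.

```python
import math


def path_length(points):
    """Sum of distances between consecutive sampled points of a stroke."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def classify_stroke(points, line_threshold=0.95):
    """Label a stroke as 'dot', 'line' or 'curve'.

    A stroke whose endpoint distance is nearly equal to its path length
    is close to straight. The threshold is an assumed, illustrative value.
    """
    length = path_length(points)
    if length == 0:
        return "dot"  # empty, single-point, or zero-movement stroke
    straightness = math.dist(points[0], points[-1]) / length
    return "line" if straightness >= line_threshold else "curve"


# A slightly wobbly horizontal stroke and a half-circle arc.
straight = [(0, 0), (1, 0.01), (2, 0), (3, 0.02)]
arc = [(math.cos(t), math.sin(t)) for t in (0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]

print(classify_stroke(straight))
print(classify_stroke(arc))
```

A real recognizer would need many more shape classes and would have to cope with noisy input, but the same principle applies: simple geometric tests on strokes yield the "letters" from which larger diagrams are read.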

There are only a handful of places throughout the world where sketch recognition research is being carried out, Hammond says. Texas A&M has the largest sketch recognition lab with 18 student researchers, followed by MIT, Hammond’s alma mater.

The first steps
An ardent lover of mathematics, Hammond took a computer science class for the first time in her junior year of college after one of her friends recommended it.

"I came back after the class and thought ‘Oh my god! Computer science is mathematics for people who want instant gratification. I can immediately test and evaluate the algorithms I create.’ Originally interested in neurology or psychiatry, I fell in love with the idea of artificial intelligence and using computers to better understand people by developing algorithms to simulate intelligence," she says.

Her newfound passion for computer science led her to pursue a Ph.D. in computer science (specializing in artificial intelligence) from MIT, after first completing a master’s degree in computer science and another master’s degree in anthropology from Columbia University.

"If you want to be successful you always have to pick something you like and enjoy doing. It has to be part of your playtime, otherwise you can’t accomplish much," Hammond says.

The Aggie connections
Hammond came to Texas A&M in 2006.

"Aggieland is sort of Disneyland for academics," Hammond says. "I was shocked when I came to Texas A&M. Everybody was so happy. I was incredibly intrigued. The department and university are very supportive of new research."

That infectious spirit and atmosphere of support have led to some fruitful collaborations with researchers in other departments at Texas A&M.

In collaboration with Dr. Julie Linsey in the Mechanical Engineering Department, Dr. Erin McTigue in the Department of Teaching, Learning and Culture, and Dr. Anthony Cahill in the Civil Engineering Department, Hammond and her students designed and developed the MECHANIX project. The project is funded by Google and NSF, and is currently used to teach the Engineering 111 course to students from a variety of engineering disciplines, including civil engineering and mechanical engineering. Engineering 111 is the required introductory class for all engineering majors across multiple departments. Students are taught free body diagrams, among other things.

With MECHANIX, the instructor types a question, loads an image, and simply draws the diagrammatic answer directly onto the computer. To correctly answer the question, both the student’s diagram and computed force values must be correct. The computer automatically gives immediate, detailed feedback on the student’s hand-drawn solution when requested.

"Because hundreds of students take the Engineering 111 course every semester, correcting such hand-drawn problems was extremely time-consuming before," Hammond says. "The number of hand-drawn homework problems was limited, despite the importance of these problems in understanding the material. Additionally, feedback would come days or weeks later."

But now, with MECHANIX, instructors can give the students any number of problems, with the feedback and grades coming instantaneously to the students.

Another project, iCanDraw?, was developed by two of Hammond’s students, Daniel Dixon and Manoj Prasad, in collaboration with Joshua Bienko, a professional artist and an art professor in the Visualization Lab at Texas A&M, who uses some of the project’s concepts in his classrooms.

"The students are brilliant and that is what I love about this University, and that is what drives me to work every day," Hammond says.

iCanDraw? is the first application to use sketch recognition to assist a user in creating a rendering of a human face. The application is intended to improve the user’s ability to draw.

Once an image is loaded into the computer, iCanDraw? performs face recognition on that image and gives a template to the user. The program works with the user to draw the face and uses the template to understand what the user is drawing. This lets the user apply multiple brush strokes naturally, without having to worry about drawing each feature in a single stroke. At the end, the system performs face recognition on the final sketch to obtain a face template, which it compares with the template from the original image.

"Sketches are inherently messy and abstract, making it otherwise difficult to compare them to the concrete original photograph. People tend to draw the perceptually important parts (like fingers), and leave out others. However, with our method, even with artful shading or missing elements, the face recognition software can still understand the diagram," Hammond says.

Extra mile
Another application, GeoTrooper, was developed to ease the coordination and reassembly of airborne paratrooper teams. Hammond and her students tested GeoTrooper multiple times in 2010 with U.S. Army paratroopers.

GeoTrooper is a location-aware system that uses an ad-hoc Wi-Fi network to broadcast and receive GPS coordinates of equipment and rendezvous points. The system consists of beacons (devices attached to equipment and placed at assembly points that broadcast their position using ad-hoc Wi-Fi) and receivers (Android phones or other handheld devices that orient the user toward the beacons and any predetermined coordinates).
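How a receiver might turn two sets of GPS coordinates into directions can be sketched with standard great-circle formulas. The example below is an illustration only, not GeoTrooper's actual code: the function names, the sample coordinates, and the assumption that the receiver knows its own compass heading are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (distance in meters, initial bearing in degrees from north)
    from point 1 (the receiver) to point 2 (the beacon)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine formula for great-circle distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing, normalized to the range [0, 360).
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance, bearing


def turn_angle(heading, bearing):
    """Degrees the user should turn: positive is right, negative is left."""
    return (bearing - heading + 180) % 360 - 180


# Hypothetical receiver and beacon positions.
dist, brg = distance_and_bearing(35.1400, -79.0000, 35.1450, -78.9950)
print(f"beacon is {dist:.0f} m away, bearing {brg:.0f} degrees")
print(f"turn {turn_angle(350, brg):+.0f} degrees")
```

Over the short distances involved in regrouping after a jump, a flat-earth approximation would likely serve as well. The great-circle version is shown because it needs no extra assumptions about scale.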

"After I emphasized that I needed to understand the environment we were developing for as best I can, Lt. Gen. Helmick, commanding general of the 18th Airborne Corps, threw me out of a plane—with a parachute," Hammond says. "My anthropological background always encouraged me to get as close to the research as possible.

"You have to go that extra mile to really produce cool stuff—you need to have lots of those eureka moments."

Teaching children
One of the projects under development in the SRL is aimed at teaching basic skills such as drawing and writing to children between the ages of 5 and 7. The project is named TaYouKi, short for "teaching assistant for young kids."

SRL researcher Francisco Vides is developing TaYouKi. One application of the project teaches children to draw a human eye on a computer in six steps. With an eye diagram as a template, the child gets instructions on how to draw and feedback after each step. If the child is not satisfied, he or she can go back a step and erase it, or erase all steps and start afresh. The steps essentially teach the child to draw an eye by splitting it into different shapes, such as a circle (eyeball) or a curve (eyebrow).
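The step-by-step flow described here, with the option to undo one step or start over, maps naturally onto a stack. The sketch below is a hypothetical illustration, not TaYouKi's implementation: the class name and the step labels are invented for the example, and only two of the six steps are shown because the article names only the circle and the curve.

```python
class SteppedLesson:
    """Tracks a child's progress through an ordered list of drawing steps."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.completed = []  # stack of drawings, one per finished step

    @property
    def current_step(self):
        """The next instruction to show, or None when the lesson is done."""
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def submit(self, drawing):
        """Record the drawing for the current step and advance."""
        if self.current_step is None:
            raise ValueError("lesson already finished")
        self.completed.append(drawing)

    def undo(self):
        """Erase the most recent step, if there is one."""
        if self.completed:
            self.completed.pop()

    def restart(self):
        """Erase every step and begin again."""
        self.completed.clear()


# Hypothetical step labels; the real lesson has six steps.
lesson = SteppedLesson(["circle (eyeball)", "curve (eyebrow)"])
lesson.submit("first attempt at a circle")
print(lesson.current_step)  # the eyebrow step
lesson.undo()
print(lesson.current_step)  # back to the eyeball step
```

Keeping each step's drawing as a separate stack entry means undoing a step cannot disturb earlier work, which matches the "go back a step and erase it" behavior described above.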

"I do think there’s a lot of hope using sketch recognition in classrooms for various subjects, which is why I am so excited about the generalizability of the program," Hammond says.