COMMENTS:
Amir
SUMMARY:
This paper discusses a method of using the entropy rate (the degree of randomness) of a stroke to distinguish between text and shapes. The authors created an entropy model by selecting a set of alphabetic symbols and assigning each symbol a range of angles formed by the temporally adjoining points in a stroke. On the assumption that text symbols are drawn in quick succession, a time threshold was also established to group together strokes belonging to text. When tested on free body diagrams, the method achieved a classification accuracy of 92.06%.
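To make the entropy idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the six equal-width angle bins, the use of the turning angle at each interior point, and the function names `angle_symbols` and `shannon_entropy` are all my own assumptions for illustration. The only things taken from the paper are the alphabet (A-F plus X) and the idea of measuring the entropy of the resulting symbol sequence.

```python
import math
from collections import Counter

def angle_symbols(points, bins=6):
    """Map a stroke to symbols: 'X' for the two endpoints, 'A'..'F' for interior points.

    Assumption: each interior point is binned by its absolute turning angle
    in [0, pi], using `bins` equal-width ranges. The paper's actual ranges may differ.
    """
    symbols = ["X"]
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into [-pi, pi) and take its magnitude.
        turn = abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi)
        index = min(int(turn / (math.pi / bins)), bins - 1)
        symbols.append(chr(ord("A") + index))
    symbols.append("X")
    return symbols

def shannon_entropy(symbols):
    """Shannon entropy (bits per symbol) of the symbol sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A straight line versus a stroke that keeps changing direction.
line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
scribble = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2), (2, 0)]
print(shannon_entropy(angle_symbols(line)))
print(shannon_entropy(angle_symbols(scribble)))
```

The straight line maps almost entirely to one symbol, so its entropy is low, while the scribble spreads across several symbols and scores higher. A classifier along these lines would then compare the entropy against a threshold, which I have left out because the summary does not give one.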
DISCUSSION:
This is a very interesting paper that presents a solution to the problem addressed in the Plimmer paper. However, this approach uses a single feature to distinguish between text and shapes instead of many, and it maintains a comparable accuracy rate. I would have liked to know how and why the symbols A-F and X were chosen for the alphabet as opposed to other letters or symbols.
Thursday, October 21, 2010
Reading #13. Ink Features for Diagram Recognition (Plimmer)
COMMENTS:
SUMMARY:
This paper discusses a method of using ink features to distinguish between text and shapes in a diagram. The authors used tree-based partitioning to identify the significant stroke features out of 46 possibilities. These features are used to identify the stroke as part of either a shape or text. When compared to the Microsoft divider and the divider in the sketching tool, InkKit, this new divider had the lowest misclassification rates for both shapes and text.
DISCUSSION:
This is a very valuable contribution in terms of text identification as I have not read very many papers on the subject. I don't know if I would go so far as to call this text recognition as the authors do because it appears that this work does not address any meaning assigned to the text. However, I agree that accurately differentiating between text and shapes is the first step toward text recognition.
Thursday, October 14, 2010
Reading #12. Constellation Models for Sketch Recognition. (Sharon)
COMMENTS:
SUMMARY:
This paper gives an overview of a constellation model for sketch recognition that uses stroke structure and distance to other known parts in order to identify objects. The strokes are labeled using a label assignment matrix, and the remaining strokes in the sketch are identified using stroke label and interaction likelihoods. This method was tested on five classes of objects, including sketches of faces and airplanes. Using a multipass threshold technique, the recognizer completed its recognition in less than 2 seconds.
DISCUSSION:
The constellation model assigns possible labels to strokes based on the features of strokes and pairs of strokes. While this work seems to be a novel approach to recognizing a small dataset with a small number of examples, I don’t see how this method of recognition is best applied to objects such as faces, which can vary drastically. I also would have liked to see some explanation of how successful the recognizer was.
Reading #11. LADDER, a sketching language for user interface developers. (Hammond)
COMMENTS:
SUMMARY:
LADDER is a sketch description language that describes how shapes are drawn and edited. This paper gives an overview of LADDER and shows how it can be used to develop a sketch interface. LADDER relies on both soft and hard constraints to facilitate recognition; soft constraints can include factors such as drawing order whereas hard constraints can be geometric constraints.
Shapes are categorized using the subcomponents of the shape, its geometric constraints, aliases for the parts of the shape, editing gestures, and display methods. The hard constraints are the basis for the predefined constraints used to create LADDER.
DISCUSSION:
This gives a good explanation of LADDER and how it uses constraints to create shape definitions in order to classify strokes as specific shapes within a domain. I think anyone who’s programmed (or completed the last assignment for this class) could appreciate a language that can classify stroke relationship or characteristics in simple text such as “line1 intersects line2”.
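As a rough idea of what has to sit behind a constraint like “line1 intersects line2”, here is a small Python sketch of a segment intersection test. This is not LADDER syntax and not how LADDER implements its constraints; the function names and the exact (zero-tolerance) geometry are my own assumptions. A recognizer working on hand-drawn input would presumably need some tolerance so that lines that almost touch still count.

```python
def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p).

    Returns 1 for a counter-clockwise turn, -1 for clockwise, 0 if collinear.
    """
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)

def on_segment(p, q, r):
    """True if r lies inside the bounding box of segment pq.

    Only meaningful when p, q and r are already known to be collinear.
    """
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def intersects(line1, line2):
    """Exact test of whether two segments, each given as (start, end), share a point."""
    (a, b), (c, d) = line1, line2
    o1, o2 = orientation(a, b, c), orientation(a, b, d)
    o3, o4 = orientation(c, d, a), orientation(c, d, b)
    # General case: each segment's endpoints fall on opposite sides of the other.
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and on_segment(a, b, c)) or
            (o2 == 0 and on_segment(a, b, d)) or
            (o3 == 0 and on_segment(c, d, a)) or
            (o4 == 0 and on_segment(c, d, b)))

line1 = ((0, 0), (2, 2))
line2 = ((0, 2), (2, 0))
print(intersects(line1, line2))
```

Writing even this one check by hand shows why being able to state the relationship in a single readable line is appealing.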
One challenge to LADDER that I found interesting was the task of trying to identify incomplete shapes on the screen while the sketch is still in progress. Depending on how complicated the sketch is, unrecognized strokes can slow down the recognizer. I would think that most people would want to complete a shape they’re working on before moving on to another one. Instant feedback showing that a stroke is unrecognized would discourage the user from moving on until it’s complete.
Wednesday, October 13, 2010
Reading #10. GRAPHICAL INPUT THROUGH MACHINE RECOGNITION OF SKETCHES (Herot)
COMMENTS:
SUMMARY:
This paper describes a set of FORTRAN programs created for sketch recognition called the HUNCH system. The programs work together to perform different interpretations of the parts of a sketch. The STRAIT program uses the minima of the speed function to locate the corners in a sketch. Anything that shows a gradual increase or decrease in speed is passed on to the CURVIT program to further refine the results.
HUNCH faced certain challenges such as latching, which is the process of joining endpoints that are in close proximity to each other. The latching in the STRAIT program was improved, which resulted in STRAIN. The authors discuss other challenges to interpretation, such as double lines. In application, HUNCH uses a context-free data structure to assign meanings to the parts of a sketch.
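The speed-minima idea behind STRAIT can be sketched in a few lines of Python. This is only my own illustration, not a port of the FORTRAN program: the `(x, y, t)` sample format, the function names, and the rule "a local minimum below half the mean speed" (the `ratio` parameter) are assumptions I picked to keep the example small.

```python
import math

def speeds(samples):
    """Pen speed over each segment between consecutive (x, y, t) samples."""
    result = []
    for (x0, y0, t0), (x1, y1, t1) in zip(samples, samples[1:]):
        dt = t1 - t0
        result.append(math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0)
    return result

def corner_candidates(samples, ratio=0.5):
    """Indices of samples where a slow segment starts.

    A segment counts as slow when its speed is a local minimum and falls
    below ratio * mean speed (the 0.5 default is a hypothetical threshold).
    """
    v = speeds(samples)
    if not v:
        return []
    threshold = ratio * (sum(v) / len(v))
    corners = []
    for i in range(1, len(v) - 1):
        if v[i] < threshold and v[i] <= v[i - 1] and v[i] < v[i + 1]:
            corners.append(i)
    return corners

# An L-shaped stroke: fast to the right, slow around the corner, fast upward.
stroke = [(0, 0, 0), (10, 0, 1), (20, 0, 2), (20, 1, 3), (20, 11, 4), (20, 21, 5)]
print(corner_candidates(stroke))
```

In this stroke the pen slows down right after sample 2, which sits at (20, 0), the actual corner of the L. A stroke drawn at constant speed produces no candidates at all, which hints at why speed alone is not enough for carefully drawn curves.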
DISCUSSION:
The programs presented in this paper were a significant contribution to sketch recognition at the time they were developed. I haven’t come across any works prior to this that feature corner finding methods as successful as the one presented here (someone please correct me if there’s another paper out there that I’m not considering).
The approach presented for an interactive system seems to suggest a certain level of machine learning through input from the user. However, I think the authors could have mentioned how much input from the users was required for the system to make a successful interpretation.
On a side note, I thought the text in the paper was difficult to read. It may have been the quality of the file itself, I don’t know.
Sunday, October 10, 2010
Reading #9. PaleoSketch: Accurate Primitive Sketch Recognition and Beautification (Paulson)
COMMENTS:
SUMMARY:
This paper discusses PaleoSketch, a low-level sketch recognition system that recognizes primitive shapes such as lines, polylines, arcs, and curves. PaleoSketch performs pre-recognition by resampling and removing duplicate points. In addition to the usual stroke features (stroke direction, speed, curvature, etc.), the authors introduce two new features. The first is the normalized distance between direction extremes (NDDE), which is the stroke length between the point with the highest direction value and the point with the lowest direction value, divided by the total stroke length. The second is the direction change ratio (DCR), which is the maximum change in direction divided by the average change in direction.
PaleoSketch assigns a set of conditions to each primitive and determines which shape is the best fit. The authors go on to describe the conditions of each primitive shape. PaleoSketch has an accuracy rate of 98.56%.
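The two features introduced in the paper, the normalized distance between direction extremes (NDDE) and the direction change ratio (DCR), are easy to try out. The Python below is a minimal sketch based only on the definitions in the summary above, not PaleoSketch's code: the function names, the direction unwrapping, and measuring NDDE from the start of the earlier extreme segment through the end of the later one are my assumptions, and I skip any trimming of the stroke ends.

```python
import math

def directions(points):
    """Direction (radians) of each segment, unwrapped so that consecutive
    values never jump by more than pi."""
    dirs = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = math.atan2(y1 - y0, x1 - x0)
        if dirs:
            while d - dirs[-1] > math.pi:
                d -= 2 * math.pi
            while d - dirs[-1] < -math.pi:
                d += 2 * math.pi
        dirs.append(d)
    return dirs

def ndde(points):
    """Stroke length spanned by the highest- and lowest-direction segments,
    divided by the total stroke length."""
    dirs = directions(points)
    seg = [math.hypot(x1 - x0, y1 - y0)
           for (x0, y0), (x1, y1) in zip(points, points[1:])]
    total = sum(seg)
    if total == 0:
        return 0.0
    a, b = sorted((dirs.index(max(dirs)), dirs.index(min(dirs))))
    return sum(seg[a:b + 1]) / total

def dcr(points):
    """Maximum change in direction divided by the average change in direction."""
    dirs = directions(points)
    changes = [abs(b - a) for a, b in zip(dirs, dirs[1:])]
    if not changes or sum(changes) == 0:
        return 0.0
    return max(changes) / (sum(changes) / len(changes))

# A quarter-circle arc and an L-shaped polyline.
arc = [(math.cos(k * math.pi / 12), math.sin(k * math.pi / 12)) for k in range(7)]
polyline = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(ndde(arc), dcr(arc))
print(dcr(polyline))
```

On the arc the direction changes by the same small amount at every step, so the DCR stays close to 1 and the direction extremes sit at the two ends of the stroke. On the polyline the whole change in direction is concentrated at the single corner, so the DCR spikes. Real strokes are noisy, so the values will be far less clean than in this toy data.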
DISCUSSION:
I think the first programming assignment is a great introduction to PaleoSketch and the sketch recognition problem. Before I read this paper, I had only a fuzzy understanding of which strokes count as primitives. While I think this is still somewhat arguable, I believe the authors have chosen a good set of primitives to recognize. So far, I have not seen a more complicated primitive in the data that was provided for the assignment.
Using the TestApplication, you can see how the recognizer determines a line is a polyline, but apparently this condition is easy to fool. I don’t believe the system can recognize polylines consistently if the change in direction is minimal. Perhaps in this case speed should be carefully weighed.