Saturday, September 18, 2010

Reading #8. A Lightweight Multistroke Recognizer for User Interface Prototypes (Anthony)

COMMENTS:


SUMMARY:

This paper discusses $N, an extension of the $1 recognizer created by Wobbrock.  The $N recognizer has the following enhancements:
- recognizes gestures composed of multiple strokes
- generalizes from one multistroke to all possible multistrokes (all stroke orders and directions; see the sketch after this summary)
- recognizes one-dimensional gestures
- provides bounded rotation invariance

The authors give an overview of the $1 recognizer and then explain the $N enhancements and how they are implemented.  The $N recognizer was 96.6% and 96.7% accurate on algebra symbols and the unistroke collection, respectively.
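
To make sure I understood the "one multistroke to all multistrokes" generalization, I wrote a rough Python sketch of the key idea (my own toy code, not the authors' implementation): every stroke order and every per-stroke direction is enumerated and joined into a unistroke, so a single example yields n! * 2^n unistroke templates.

```python
from itertools import permutations, product

def multistroke_to_unistrokes(strokes):
    """Enumerate every unistroke reading of a multistroke: all stroke
    orders, with each stroke traversed in either direction.
    `strokes` is a list of strokes; each stroke is a list of (x, y)."""
    unistrokes = []
    for order in permutations(range(len(strokes))):                # every stroke order
        for flips in product([False, True], repeat=len(strokes)):  # every direction
            path = []
            for index, reverse in zip(order, flips):
                stroke = strokes[index]
                path.extend(reversed(stroke) if reverse else stroke)
            unistrokes.append(path)
    return unistrokes

# A two-stroke "X" yields 2! * 2**2 = 8 unistroke templates.
x_gesture = [[(0, 0), (10, 10)], [(10, 0), (0, 10)]]
print(len(multistroke_to_unistrokes(x_gesture)))  # 8
```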


DISCUSSION:

The $N recognizer is an impressive enhancement to $1, and at roughly 240 lines of code, it is still fairly lightweight.  I like how the authors presented both the strengths and the weaknesses of the recognizer in detail.  I believe that the segmentation issue, which was not addressed by the recognizer, could be handled by input from the user and would not present a huge hindrance to the recognizer's effectiveness.

My only concern was the training method; 15 training examples were used for the algebra symbols, but only 9 examples were used for the unistroke gestures.  Given that the recognition accuracy differed by only 0.1%, the number of examples did not have a significant effect, but I wonder why the training examples were not kept consistent in order to eliminate the number of examples as a factor in the performance difference.  Also, I felt that the authors should have tried to match as many elements of the $1 testing as possible (e.g., testing with adults instead of students).

Wednesday, September 15, 2010

Reading #7. Sketch Based Interfaces: Early Processing for Sketch Understanding (Sezgin)

COMMENTS:


SUMMARY:

This paper discusses the authors' effort to create a program that can understand free-hand sketches.  The goal of this work is to provide a natural means of interaction that closely mimics pen and paper and to recognize a wide range of shapes.

The sketch processing entailed stroke approximation, beautification, and recognition.  Stroke approximation consisted of identifying the vertices and segments of the stroke; beautification involved making minor adjustments to the strokes to make curves and lines look neat; and the last part of the process was recognizing basic shapes.
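
As I read it, the vertex detection step combines curvature and pen-speed data with average-based filtering: corners are where curvature is high and the pen slows down. Below is my much-simplified Python sketch of that intuition (the thresholds are assumptions, and the paper actually builds separate candidate sets from curvature and speed and merges them with hybrid fits).

```python
import math

def candidate_vertices(points, times):
    """Toy average-based filtering: flag points where curvature is high
    and pen speed is low. `points` is [(x, y), ...]; `times` matches."""
    n = len(points)
    # Direction of each segment, then curvature as the change in direction
    # (ignoring angle wrap-around for simplicity).
    direction = [math.atan2(points[i + 1][1] - points[i][1],
                            points[i + 1][0] - points[i][0]) for i in range(n - 1)]
    curvature = [abs(direction[i + 1] - direction[i]) for i in range(n - 2)]
    speed = [math.dist(points[i], points[i + 1]) / max(times[i + 1] - times[i], 1e-6)
             for i in range(n - 1)]

    curv_thresh = sum(curvature) / len(curvature)     # assumed: mean curvature
    speed_thresh = 0.9 * sum(speed) / len(speed)      # assumed: 90% of mean speed
    return [i + 1 for i in range(len(curvature))
            if curvature[i] > curv_thresh and speed[i + 1] < speed_thresh]
```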


DISCUSSION:

This paper reminds me of Wobbrock’s in the sense that it seems as if the authors are trying to create a free-hand sketch recognition tool that can be incorporated into an interface.  I like their three-step approach; however, I was a bit skeptical about the effectiveness of the vertex detection method.  When selecting a threshold for the curvature graph, it would seem that a combination of curves and line segments would present a problem.  Also, the authors state that the system’s approximation of shapes had an accuracy rate of 96%, but I don’t think the paper stated whether or not the test shapes were created by the users.  Did anyone else see otherwise?

Monday, September 13, 2010

Reading #6: Protractor: A Fast and Accurate Gesture Recognizer (Li)

COMMENTS:

Sampath Jayarathna

SUMMARY:

This paper discusses Protractor, a gesture recognizer that is fast and requires little memory. The author begins by comparing template-based and parametric-based approaches to gesture recognition and notes the difficulties with both. Protractor uses a nearest-neighbor approach that preprocesses each gesture into a uniform vector representation. When compared to the $1 recognizer, Protractor produced a similar error rate with faster processing time.
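
Here is my rough Python sketch of the preprocessing and the closed-form matching as I understood them (assuming the gesture has already been resampled to a fixed number of points, and using the orientation-invariant variant; this is my reading, not Li's code).

```python
import math

def vectorize(points):
    """Normalize a resampled gesture into a unit vector: translate the
    centroid to the origin, rotate so the indicative angle is zero,
    and scale to unit length (orientation-invariant case)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    pts = [(x - cx, y - cy) for x, y in points]
    theta = math.atan2(pts[0][1], pts[0][0])          # indicative angle
    c, s = math.cos(-theta), math.sin(-theta)
    flat = []
    for x, y in pts:
        flat += [x * c - y * s, x * s + y * c]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat]

def optimal_cosine(v1, v2):
    """Closed-form best cosine similarity over all rotations of v1."""
    a = sum(p * q for p, q in zip(v1, v2))
    b = sum(v1[i] * v2[i + 1] - v1[i + 1] * v2[i] for i in range(0, len(v1), 2))
    return math.hypot(a, b)   # templates are then ranked by the inverse angular distance
```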



DISCUSSION:

Protractor is similar to the $1 recognizer except for its method of removing orientation noise. I can appreciate the fact that Protractor is fast and suitable for implementing on mobile devices. However, I wonder if the simplicity of the $1 recognizer doesn't still make it the more appealing choice.

Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Wobbrock)

COMMENTS:

Crazy Chris

SUMMARY:

This paper describes the $1 gesture recognizer. The authors compare the $1 recognizer's performance to that of Rubine's classifier and Dynamic Time Warping (DTW) algorithms. In developing this recognizer, the authors set out to create something that was resilient to sampling variations, position invariant, simple, easily written, fast, and comparable in accuracy to existing recognition algorithms.

The $1 algorithm involves 4 steps: Resample the Point Path, Rotate Once Based on the “Indicative Angle”, Scale and Translate, and Find the Optimal Angle for the Best Score. Testing of the $1 recognizer resulted in 97% accuracy with one loaded template and 99.5% accuracy with 3+ templates. However, one weakness of the algorithm is that it cannot distinguish gestures based on orientation or aspect ratio (i.e., it cannot tell the difference between a circle and an oval).
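
The pseudocode in the appendix maps almost line-for-line onto real code, so I tried a compact Python translation of the four steps to check my understanding (my own simplification; degenerate cases such as zero-length strokes aren't handled). To recognize, you normalize the candidate once and take the template with the smallest distance at the best angle.

```python
import math

N, SIZE = 64, 250.0   # resampled point count and reference square

def resample(points, n=N):
    """Step 1: resample the path into n equidistantly spaced points."""
    pts = list(points)
    interval = sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts))) / (n - 1)
    new_pts, accum, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if accum + d >= interval:
            t = (interval - accum) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            new_pts.append(q)
            pts.insert(i, q)    # q becomes the origin of the next segment
            accum = 0.0
        else:
            accum += d
        i += 1
    while len(new_pts) < n:     # guard against float round-off
        new_pts.append(pts[-1])
    return new_pts[:n]

def centroid(pts):
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def rotate_by(pts, theta):
    (cx, cy), c, s = centroid(pts), math.cos(theta), math.sin(theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def normalize(points):
    """Steps 1-3: resample, rotate the indicative angle to zero,
    scale to a reference square, translate the centroid to the origin."""
    pts = resample(points)
    cx, cy = centroid(pts)
    pts = rotate_by(pts, -math.atan2(pts[0][1] - cy, pts[0][0] - cx))
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    # Non-uniform scaling is why 1D gestures break $1 (the issue $N fixes).
    pts = [(x * SIZE / max(w, 1e-6), y * SIZE / max(h, 1e-6)) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def path_distance(a, b):
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(pts, tmpl, lo=-math.radians(45), hi=math.radians(45)):
    """Step 4: golden section search for the rotation minimizing distance."""
    phi = (math.sqrt(5) - 1) / 2
    x1, x2 = phi * lo + (1 - phi) * hi, (1 - phi) * lo + phi * hi
    f1 = path_distance(rotate_by(pts, x1), tmpl)
    f2 = path_distance(rotate_by(pts, x2), tmpl)
    while abs(hi - lo) > math.radians(2):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = phi * lo + (1 - phi) * hi
            f1 = path_distance(rotate_by(pts, x1), tmpl)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = (1 - phi) * lo + phi * hi
            f2 = path_distance(rotate_by(pts, x2), tmpl)
    return min(f1, f2)
```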


DISCUSSION:

Compared to most sketch recognition algorithms, the $1 recognizer gets the most bang for the buck (no pun intended...ok maybe a little one). The paper gives a nice explanation of each of the four steps performed by the recognizer, which correspond well to the pseudocode in the appendix. One thing that was a little confusing was the discussion of the effect gesture speed can have on recognition. The authors mentioned that Rubine's best results were obtained when the subjects' gestures were of medium speed. I was curious to know what the speed range was for the fast, medium, and slow categories in milliseconds; I don't think Table 1 in the paper was very clear about this, or maybe I misread it.

Thursday, September 9, 2010

Reading #4: Sketchpad: A Man-Machine Graphical Communication System (Sutherland)

COMMENTS:

The Amazing Drew-Drew Logsdon!

SUMMARY:

This paper describes Ivan Sutherland’s Sketchpad system, the first pen-based computer system. Sutherland begins by describing an example of how sketchpad is used. A light pen, a box of buttons and a bank of toggle switches are used to create a simple shape drawing. The shape can then be printed or “inked” on paper using a PACE plotter.

Sutherland gives an overview of how sketchpad can be used to alter and/or move shapes on the display. Sketchpad features the following capabilities: subpicture (an image made up of smaller images repeated), constraint (relationships between lines) and definition copying (copying attributes of one shape to another). Sutherland discusses some possible applications for sketchpad including creating circuit diagrams or highly repetitive drawings.

Sutherland then goes into the details of how Sketchpad works. It uses a ring structure of pointers to keep track of elements. The light pen’s optics are used to identify spots within the pen’s field of view on the display, and each spot is tagged with the address of the element it represents, which lets the system identify the element under the pen. Sutherland explains how various elements are created in the display. For example, lines and circles are generated using difference equations. Text is added to the display from special tables containing the line and circle segments used to create the letters and numbers.
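
I had only ever seen circles drawn with sin/cos per point, so I tried the incremental idea out; this is just a toy Python illustration of a difference-equation circle generator (the classic form, not necessarily Sketchpad's exact equations).

```python
import math

def circle_points(cx, cy, radius, eps=0.05):
    """Trace an approximate circle with a difference equation: each new
    point comes from adds and multiplies only, no sin/cos per point.
    Using the already-updated x in the y update keeps the orbit stable."""
    x, y = radius, 0.0
    pts = []
    for _ in range(int(2 * math.pi / eps) + 1):
        x -= eps * y
        y += eps * x          # note: uses the new x
        pts.append((cx + x, cy + y))
    return pts
```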

Recursion is used to implement many of Sketchpad’s functions, such as deletion and merging. The draw and copy functions are the result of manipulation with the light pen in conjunction with the button box, which creates the ring structure of parts required for the drawing.

Finally, Sutherland discusses some of the drawings that can be done using Sketchpad, such as patterns that show shapes repeated many times and dimension lines that show absolute scale. He also mentions how it can be used to create non-technical, artistic drawings. The paper’s future work includes improvements to Sketchpad that would allow conversion from photographs to line drawings and the application of one shape’s attributes to another.

DISCUSSION:

I think the level of detail Sutherland goes into when describing the introductory example goes to show just how novel the pen-based interface was at that time. Many of the functions that Sutherland described involved interactions that we may consider fairly intuitive today. I also thought it was interesting how Sutherland made a point of describing behaviors that we may take for granted in today’s drawing applications. For example, having line segments of a shape translate as a unit when the shape is moved.

One area where I thought there might be some difficulty was in the function of the light pen. Since the system determines the pseudo location of the pen when it’s aimed at a part of the drawing, how does it distinguish the selection of parts that are close in proximity? Maybe there was a button for that too?

One thing I found very interesting was the fact that the hexagonal lattice design would have taken almost two days to create using drafting prior to Sketchpad. And, I was really curious about what that PACE plotter looked like, but I couldn’t find an image of one online. Does anyone else have a reference?

Tuesday, September 7, 2010

Reading #3: “Those Look Similar!” Issues in Automating Gesture Design Advice (Long)

COMMENTS:

Francisco Vides

SUMMARY:

This paper gives an overview of Quill, a gesture design tool that allows developers to build applications using gesture recognition. The authors, Long, Landay, and Rowe, explore ways of improving Quill by providing feedback to users in order to help them make their gestures better. In this case they experiment with giving advice to users when analysis determines that two gestures are too similar to be distinguished by people or the computer.

The authors begin by describing the study conducted to determine the criteria used to judge gesture similarity. Participants were asked to judge the similarity of gestures in three experiments: the first two involved selecting the most dissimilar gesture from many groups of three gestures, and the third involved judging the similarity of pairs of gestures.

The overview of the experiments is followed by a description of Quill and the gesture design process. Similarity metrics are used to analyze the created gestures and determine whether any two are similar enough to be confusing. Advice for the user is then delivered in the form of unsolicited warnings.
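
My mental model of the similarity analysis is a pairwise comparison like the hypothetical Python sketch below (Quill's real metrics came out of the human-judgment experiments; the plain Euclidean distance and the `threshold` cutoff here are my own stand-ins).

```python
import math

def too_similar(gesture_features, threshold):
    """Hypothetical sketch of a similarity check: compare every pair of
    gesture classes in feature space and collect the pairs close enough
    to deserve a warning. `gesture_features` maps class name -> feature list."""
    warnings = []
    names = sorted(gesture_features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist = math.dist(gesture_features[a], gesture_features[b])
            if dist < threshold:   # Quill would phrase this as advice to the designer
                warnings.append((a, b, dist))
    return warnings
```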

Advice timing and content, as well as issues related to background analysis are some of the challenges encountered when the advice feature was implemented. The authors ultimately decided to allow actions to continue during gesture analysis and cancel analysis when an affecting change occurs.

DISCUSSION:

Providing feedback to help users make their designs less similar seems like a very useful feature to add to the Quill application. These types of user aids are prevalent in many development environments today. However, I can see how the authors would have difficulty determining when and how to display helpful alerts, especially when they’re unsolicited. I would imagine that these types of alerts would be more annoying than helpful in cases where the information is wrong or unwanted. I think allowing some user control over feedback and when it’s delivered could also benefit this feature. This may have been alluded to in the future work and conclusions, but it was not completely clear to me.

Reading #2: Specifying Gestures by Example (Rubine)

COMMENTS:

Jonathan Hall

SUMMARY:

This paper describes Rubine's gesture recognition technology. GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture) is a toolkit for developing gesture-based interfaces and was used to create a gesture-based drawing program (GDP).

Rubine begins by explaining how the GRANDMA toolkit was used to create new gestures and handlers for a GDP. Both the GDP interface and application were developed using the GRANDMA toolkit. A user creates a new gesture class using the gesture designer; 15 training samples provided sufficient variance in the structure of the gesture.

Rubine goes on to discuss the 13 features used for gesture recognition and how these features are used to classify gestures. Each gesture class has a set of weights, and the goal of training is to determine those weights from the sample gestures. If a gesture's evaluation is ambiguous between more than one class, the gesture is rejected.
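
Here is a small Python sketch of the classification step as I understand it: a linear evaluation per class plus rejection of ambiguous gestures (the data layout and the 0.95 cutoff are my assumptions about the details).

```python
import math

def classify(features, weights):
    """Rubine-style linear classifier. `weights` maps class name ->
    (w0, [w1..w13]); the score of class c is w0 + sum(wi * fi).
    Rejects a gesture as ambiguous when the winner's estimated
    probability of correctness is low (0.95 is an assumed cutoff)."""
    scores = {c: w0 + sum(wi * fi for wi, fi in zip(ws, features))
              for c, (w0, ws) in weights.items()}
    best = max(scores, key=scores.get)
    # Estimated probability that the top class is correct.
    p = 1.0 / sum(math.exp(s - scores[best]) for s in scores.values())
    return best if p >= 0.95 else None   # None = rejected as ambiguous
```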

Rubine also briefly mentions GDPs developed using Eager Recognition and Multi-finger recognition as extensions. The eager method recognizes gestures as soon as they are unambiguous, and multi-finger recognizes gestures made with multiple fingers simultaneously.

DISCUSSION:

GRANDMA sounds like an awesome toolkit for creating gesture recognizers. After reading this paper, it’s clearer to me how gestures are being used not only to create shapes and sketches, but to perform certain actions in the GDP. However, I was a bit confused by the stroke manipulation phase. I don’t know if it was clear whether or not gestures for manipulation are classified the same way as other high-level operation gestures.

I find the simplicity of the 13 features used for recognition very appealing, but I still believe that they are explained better in Hammond’s gesture recognition chapter. Also, Figure 6 in the paper should have been divided up to show each feature clearly, but I imagine that would make for a very long paper.

Thursday, September 2, 2010

Reading #1: Gesture Recognition (Hammond)

COMMENTS:

Jianjie Zhang

SUMMARY:

This paper begins with a brief description of gesture recognition and how it relates to sketch recognition. It then gives an overview of some foundational methods used for gesture recognition.
Hammond discusses Dean Rubine’s gesture recognition method and gives an explanation of each of the 13 stroke features that Rubine uses to recognize gestures. Hammond also discusses Christopher Long’s work on gesture recognition. Long’s method is an extension of Rubine’s that uses 22 features, 11 of which came from Rubine. Long did not feel that the time features contributed to recognition and therefore did not include them in his method. Hammond also talks about Wobbrock’s $1 recognizer, which uses a template matcher instead of a feature-based method.
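
Since the chapter walks through each of Rubine's features with an example, I sketched a few of them in Python to check my reading (f1/f2 use the third point to reduce noise, as described; this is my own unverified code).

```python
import math

def rubine_features_subset(points):
    """A few of Rubine's 13 features: f1/f2 (cosine and sine of the
    initial angle, measured to the third point), f5 (total length),
    and f9 (total angle traversed). `points` is [(x, y), ...]."""
    dx0, dy0 = points[2][0] - points[0][0], points[2][1] - points[0][1]
    d0 = math.hypot(dx0, dy0)
    f1, f2 = dx0 / d0, dy0 / d0

    deltas = [(points[i + 1][0] - points[i][0], points[i + 1][1] - points[i][1])
              for i in range(len(points) - 1)]
    f5 = sum(math.hypot(dx, dy) for dx, dy in deltas)        # total stroke length

    f9 = sum(math.atan2(deltas[i][0] * deltas[i - 1][1] - deltas[i - 1][0] * deltas[i][1],
                        deltas[i][0] * deltas[i - 1][0] + deltas[i][1] * deltas[i - 1][1])
             for i in range(1, len(deltas)))                 # signed turning angle
    return f1, f2, f5, f9
```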

DISCUSSION:

This chapter is a great one-stop reference for a collection of gesture recognition methods. It provides a brief and clear example of each of Rubine’s features. I think the organization of the features along with their descriptions is a little easier to read here than in the Features section of Rubine’s Specifying Gestures by Example paper, but maybe that’s just me.
I kind of like the update notes that Aaron added (although some of them were repeated), particularly the one about how higher sampling rates can cause problems when computing the rotational change and smoothness features of the stroke. However, I don’t understand how deleting consecutive duplicate points isn’t accomplished anyway by resampling. Also, I was confused by some of the figures in the paper, but the ones that fit with the explanations were very helpful.

Wednesday, September 1, 2010



2. E-mail address: dcummings@cse.tamu.edu

3. Graduate standing: 3rd year PhD

4. Why are you taking this class? I’m exploring new research areas and want to get a good understanding of sketch recognition.

5. What experience do you bring to this class? I’ve attempted to perform sketch recognition using tools in geographic mapping software.

6. What do you expect to be doing in 10 years? The same thing I do every night Pinky…

7. What do you think will be the next biggest technological advancement in computer science? Hopefully something I’m working on. If not, I hope the holodeck will be the next big technological advancement or at least something like it.

8. What was your favorite course when you were an undergraduate (computer science or otherwise)? Computer Graphics. The teacher was a really funny guy.

9. What is your favorite movie and why?  Night of the Living Dead (1968) because I love classic movies and because I’m too scared to watch it alone, but I love to try.

10. If you could travel back in time, who would you like to meet and why?  My mom and dad, just so I can make sure they fall in love at the enchantment under the sea dance and live happily ever after, thus guaranteeing my existence.

11. Give some interesting fact about yourself.  I hate the taste, sight and smell of peanut butter.