Friday, September 24, 2004

Searching for Math Equations and Symbols on the Web: Part 2

OK, so I talked about searching for math represented on the web using entity and code values (those starting with an & and ending with a ;). What about pages like this one, just mentioned in the new edition of The NSDL Scout Report for Math, Engineering, and Technology v3 n20 (September 24, 2004)? If you "view source", you'll see that they use .gif files for the math. This particular page uses the word curl several times, but there's no alt text for the images, so there's less information for an image search engine to use. Don't even try to search for curl on its own, but if you try grad OR gradient in Google Images, on the second page of the results you'll get someone's handwritten notes. The collection of images in Yahoo seems much smaller - you do get handwritten notes if you enter curl math. If you call the symbol I've been calling grad by its more generic name, "del", you'll actually get a couple of equations, interestingly enough. Google yields a few results with del math. See one interesting one: . AltaVista is apparently still using its own technology and collection, but the results are not much better.
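If you'd rather not eyeball "view source" yourself, here is a minimal sketch in Python of the same check: list the images on a page that have no alt text, since those are the ones a context-based image search has the least to go on. The class name, the function name, and the sample HTML (including the file names eq1.gif and eq2.gif) are all made up for illustration; nothing here comes from the page mentioned above.

```python
from html.parser import HTMLParser


class ImgAltAudit(HTMLParser):
    """Record every <img> tag along with its alt text (None if missing or empty)."""

    def __init__(self):
        super().__init__()
        self.images = []  # list of (src, alt-or-None) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # Treat a missing alt and an empty/whitespace-only alt the same way.
            alt = (attr_map.get("alt") or "").strip()
            self.images.append((attr_map.get("src"), alt or None))


def images_missing_alt(html_text):
    """Return the src of each image that gives a search engine no alt text."""
    parser = ImgAltAudit()
    parser.feed(html_text)
    parser.close()
    return [src for src, alt in parser.images if alt is None]


# Hypothetical snippet standing in for a page that uses .gif files for math.
sample = (
    '<p>The curl of F is <img src="eq1.gif"> and '
    'grad f is <img src="eq2.gif" alt="grad f">.</p>'
)
print(images_missing_alt(sample))
```

For the sample above, only eq1.gif is reported, because eq2.gif carries alt text that a search engine could index.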

So, what about the new(er) image searches that analyze the picture itself instead of just the context and alt text? Those that do feature extraction? Can they pick out an integral? Or a gradient? The one at Columbia I knew about a while ago (websEEk) is apparently down. The one at Penn State by Wang and Li runs only on a practice set of images - and they're from a CD of clip art. (Actually, I'm having a hard time finding any of these, so I just posted a request to a list for help.) (found via SearchEngineWatch, I don't know if it's content based)
This one does better. If you cut and paste the ∂ or the ∇, it actually comes up with mathematical equations. Now, what if we try ∂x/∂t? Nothing, but ∂x does yield ∂x. It's a start, but I'm not going to try to narrow the results!
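Since cutting and pasting the symbol itself and typing its entity are two routes to the same character, it can help to see both forms side by side. The following is only a rough sketch using Python's standard library; the helper name describe is my own, and it covers just the HTML 4 named entities that Python's table knows about.

```python
import html
from html.entities import codepoint2name


def describe(ch):
    """Show the forms a single math symbol can take in page source."""
    cp = ord(ch)
    name = codepoint2name.get(cp)  # HTML 4 entity name, if one exists
    return {
        "char": ch,
        "codepoint": f"U+{cp:04X}",
        "numeric": f"&#{cp};",
        "named": f"&{name};" if name else None,
    }


for symbol in "∂∇":
    print(describe(symbol))

# Going the other way: decode entity forms back to the characters.
print(html.unescape("&part;x/&part;t"))
print(html.unescape("&#8711;"))
```

A page might contain the literal character, the named entity (&part; or &nabla;), or the numeric code (&#8706; or &#8711;), so a search for one form will not necessarily find the others.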

I'll add to this (hopefully) when I find some other engines.

