[Chicago-talk] Mechanical Turk

Michael Potter michael at potter.name
Thu Sep 15 14:39:19 PDT 2011


Here are a couple more comments:

Errors are not a big deal.

We already deal with typos in names all the time.

To check, I think I would run each name twice; if the two results did not
match closely, I would run it a third time.
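
A minimal sketch of that check in Python, just to make the idea concrete.
The function names, the get_third callback, and the 0.9 similarity cutoff
are all my own assumptions for illustration; nothing here talks to
Mechanical Turk, it only compares transcriptions that have already come
back.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Collapse case and whitespace so trivial differences do not count as mismatches.
    return " ".join(name.lower().split())


def similar(a, b, threshold=0.9):
    # threshold is an assumed cutoff, not a value taken from this thread.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def resolve(first, second, get_third):
    """Return (name, status) for two transcriptions of the same image.

    get_third is a hypothetical callback that requests a third transcription;
    it is only called when the first two disagree.
    """
    if similar(first, second):
        return first, "agreed"
    third = get_third()
    for earlier in (first, second):
        if similar(third, earlier):
            return earlier, "tiebreak"
    # All three disagree: leave it for manual review.
    return None, "unresolved"
```

With this, resolve("Ruth Smith", "ruth  smith", ...) is accepted without a
third run, while two clearly different answers trigger the callback and the
third answer acts as the tie-breaker.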

The names are not sensitive.  The stranger would learn only that somewhere
in the world there lived a person named "Ruth Smith".  Not a big deal.  If
at some time in the future someone decides that it is a big deal, I will run
one HIT for the first name and a separate HIT for the last name.

Anyone know the trick to embedding the image in the HIT?

From what I read I need to provide a URL to the image, but I would rather
have the image embedded in the request.  Seems easier to control security.


On Thu, Sep 15, 2011 at 4:01 PM, Michael Potter <michael at potter.name> wrote:

> Yes, we are using tesseract-3.00 for OCR of the computer-printed text.
>
> We are going to try to get tesseract trained to read handwritten block
> letters, but I am not holding out a lot of hope that it will work.
>
> I am researching the next best option which might be the mechanical turk.
>
>
> On Thu, Sep 15, 2011 at 3:26 PM, Joel Berger <joel.a.berger at gmail.com> wrote:
>
>> Have you tried OCRing programmatically?
>> http://search.cpan.org/search?mode=all&query=ocr
>>
>> How have the results been? It seems that if you could eliminate the
>> easy ones and perhaps only shift the problematic ones to mTurk that
>> would be cheaper.
>>
>> Joel
>>
>> On Thu, Sep 15, 2011 at 10:18 AM, Michael Potter <michael at potter.name>
>> wrote:
>> > Perl Crew,
>> > I have been called upon to try to do "OCR" on handwriting.
>> > In particular, I need to convert a hand written name to ascii.  I could
>> > provide a small .tif with just the name in it.
>> > It came to mind that this might be a good use of mechanical turk.
>> > I am sending this to the perl list because I seem to recall some of the
>> > Mongers have worked with mechanical turk.
>> > Here are my specific questions:
>> > 1) how long is typical turn around for a response?
>> > 2) Is this a reasonable task for Mechanical Turk?
>> > I looked at the Amazon website for HITs similar to what I am trying to
>> > do.  I did not find any, but I question my ability to search completely.
>> > The closest I found was business card transcription.
>> > Your comments are welcome.
>> > --
>> > Michael Potter
>> > Replatform Technologies, LLC
>> > +1 770 815 6142
>> > michael at potter.name
>> >
>> > _______________________________________________
>> > Chicago-talk mailing list
>> > Chicago-talk at pm.org
>> > http://mail.pm.org/mailman/listinfo/chicago-talk
>> >
>>
>
>
>
>



-- 
Michael Potter
Replatform Technologies, LLC
+1 770 815 6142
michael at potter.name