Royal Genes







Error rate on FreeBMD?



Fri, 8 Dec 2006 12:29:24 -0000 soc.genealogy.britain


Will J...


Jeff...
Aside from the fact that it would catch transcription/typing errors,
which I would have thought desirable!

Will J...
Surely the most desirable double keying is from one person looking at a
FreeBMD scan and the other looking at a microfiche or the original register
at the FRC?

Dave Mayall...
Yes, and in many cases that is what we do.

However, the primary purpose of double keying is to catch keying errors.
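
As a rough illustration of the point about keying errors, here is a minimal Python sketch of double keying: two people key the same page independently and any field where they disagree is flagged for review. The field names, the `compare_keyings` helper and the sample entries are all made up for illustration and are not FreeBMD's actual software or data.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    # Hypothetical field names for one line of an index page.
    surname: str
    forename: str
    district: str
    volume: str
    page: str


def compare_keyings(first, second):
    """Return (row, field, first_value, second_value) for every disagreement."""
    mismatches = []
    for row, (a, b) in enumerate(zip(first, second)):
        for field in IndexEntry._fields:
            va, vb = getattr(a, field), getattr(b, field)
            if va != vb:
                mismatches.append((row, field, va, vb))
    # A different number of entries also needs a human to look at it.
    if len(first) != len(second):
        row = min(len(first), len(second))
        mismatches.append((row, "entry_count", str(len(first)), str(len(second))))
    return mismatches


# Made-up sample data: the second keyer typed 378 where the first typed 318.
keyer_a = [IndexEntry("Smith", "John", "Leeds", "9b", "318")]
keyer_b = [IndexEntry("Smith", "John", "Leeds", "9b", "378")]

print(compare_keyings(keyer_a, keyer_b))
```

A comparison like this only catches places where the two keyings differ. If both keyers misread the same poor scan in the same way, nothing is flagged, which is why keying from two different sources can catch more.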


joe.wakefield...
Just to add my 2d, here are scans of the same part of a document.
One is grayscale, the other is B/W.
Notice the 4th entry down. Anyone transcribing this from b/w would reach a
false conclusion.

singhals...
Well, maybe it's me, but I see the same thing in both; the
Xo.24 and v.318 is clear in both scans. One of them does
seem to have a cross-out, followed by an insertion, followed
by an erasure. I'm not sure that's important enough to need.

Will J...
See joe's response - Dave's already admitted greyscale is better so I don't
think there's any need for people to go out and defend black & white images!
I'm not sure if this is an unusual case but I've just finished correcting
the transcribed entries on this page
and I think about 80% of the entries had errors in all fields including the
surname. As the quarter was supposedly complete, this may have led some users
to assume that an entry did not exist. The lesson is: if you don't find
something on FreeBMD, look at the scanned pages on Ancestry.co.uk!

Will J...


singhals...
Oh for PITY's sake. *I* admit greyscale is generally better
for most purposes. What I SAID was, I don't see an
appreciable difference in the material.

Astral Voyager...
I don't recall that anyone asserted that B/W was better than Greyscale -
Will just seemed to think they did. The original complaint was that Ancestry
images were always better than FreeBMD's for the purpose of transcribing the
indexes.

I argued that was not the case and for transcription purposes, which FreeBMD
images are intended for, I found I had more success (i.e. fewer uncertain
characters) reading FreeBMD's scans. There are legal reasons why scans taken
directly from Ancestry cannot be used to transcribe for FreeBMD anyway.

Hugh Watkins...
AV is being silly because it is anonymous and bored

Astral Voyager...
Thank you for that considered and insightful contribution to the
discussion - NOT!



Astral Voyager...
Surely you read Ancestry's terms and conditions before you signed up?

Hugh Watkins...
use common sense

ancestry donates thousands to freebmd and shares the results

Dave Mayall...
They do.


so they benefit if you use their scans to transcribe

Dave Mayall...
Which we do, under the terms of an agreement with them.

FreeBMD policy is that transcribers must not use scans from the
Ancestry site.

Will J...
Dave, are you saying that the Ancestry images couldn't even be used by
transcribers to double check? Would they be allowed to double check using
microfiche at their local library/record office or the books at the FRC?


Will J...
But you will accept corrections from submitters who have viewed Ancestry
scans?


Hugh W


Will J...
Are you really suggesting that Ancestry would stop users using Ancestry
images to transcribe for a project that shares all its data with Ancestry? I
think Dave would have pointed this out already if it were the case.


Will J...
Legal reasons? That's a new one on me. I gave Ancestry as the source for all
my corrections so I suppose Dave will have to delete them all :-((

Dave Mayall...
Not the case.

FreeBMD operates a filtering system on corrections, namely that we will
not even consider a correction request unless the person submitting it
has actually checked in the index themselves.

If we didn't do this, we would be snowed under with demands to alter
the index to match the census, the family bible, or the story that
great aunt Gertrude always told.

If the person submitting the correction assures us that it has been
checked in the index, we will recheck the entry from our own sources
(this may be the original scan, it may be a replacement scan, it may be
on film, or it may be at the FRC).

The decision on whether a correction is accepted is based on sources
that we can legitimately use for transcription. Ancestry images (other
than those that Ancestry have supplied to FreeBMD) are not such a
source.


The thrust of the matter is: if someone is typing (or
OCRing) an index of what's printed in the book, then it is
the darker printed info which matters -- and that's
identical in both scans. You can't input notes,
hand-corrections or the like. So whether you can SEE it or
not is irrelevant.

Will J...
Actually it isn't identical at all; that's the whole point of the discussion.
If the original greyscale is not very good quality to begin with, i.e. there
is no clear contrast between the text and the area around it, the resulting
black and white image will be much worse and less readable, as parts of the
text will get merged with the background by the software. A way to improve
images converted to black and white is to increase the brightness a lot and
the contrast a little before converting from greyscale (that of course also
helps make greyscale more readable).
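
To make the brightness and contrast point concrete, here is a small Python sketch on one made-up row of grey values (0 is black, 255 is white). The fixed threshold of 128, the `adjust` formula and the pixel values are assumptions chosen for illustration; this is not the software that FreeBMD or Ancestry actually used.

```python
def adjust(pixel, brightness, contrast):
    """Scale contrast about mid-grey, add brightness, clamp to 0..255."""
    value = (pixel - 128) * contrast + 128 + brightness
    return max(0, min(255, round(value)))


def to_black_and_white(row, threshold=128):
    """Render grey values as '#' (black) or '.' (white) at a fixed threshold."""
    return "".join("#" if p < threshold else "." for p in row)


# Made-up row from a dingy page: background around 110, ink around 60.
row = [110, 112, 60, 58, 111, 109, 62, 110]

# Converted as-is, the dark background falls below the threshold too,
# so the ink merges with it.
print(to_black_and_white(row))

# Brightness up a lot, contrast up a little, then convert.
adjusted = [adjust(p, brightness=60, contrast=1.1) for p in row]
print(to_black_and_white(adjusted))
```

In the first line of output every pixel comes out black; in the second the background goes white and only the ink stays black. The greyscale row still holds the difference between 60 and 110, which is the information a two colour conversion throws away.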


If I _were_ defending B&W scans, exhibit A would be a
document hand-written in liquid ink on a linen-grain sheet
of paper. Exhibit B is the one typed over or under an
embossed pattern in the center of the sheet. Been there,
done that; the Tshirt was shrieking-green.

Will J...
Thanks for your reply Dave. Was the original digit you refer to handwritten
or typed? In my experience despeckling will only remove very small dots so I

david.mayall...
I would have to wade through an awful lot of stuff to find it. Having
completed the testing, and concluded that the enhanced images did
occasionally suffer from artefacts, the answer became all that was
relevant going forward, so the details were archived away.

doubt that it could really affect the recognition of characters to a trained
eye although someone unfamiliar may conceivably mistake a 1 for a 7 in that
case. If you can only cite a few faults with Ancestry images, I would

david.mayall...
Trust me, people who are good at this got it wrong. It is all a
question of how much you turn up the despeckling. If you tweak it too
far, you start to get artefacts.
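
One way this can happen, sketched in Python under stated assumptions: a simple despeckler that erases any connected group of black pixels smaller than `min_size`. The `despeckle` function, the grid and the size settings are hypothetical and only illustrate the general idea; they are not the enhancement actually applied to the Ancestry images.

```python
def despeckle(grid, min_size):
    """Erase 4-connected groups of '#' that have fewer than min_size pixels."""
    height, width = len(grid), len(grid[0])
    cells = [list(line) for line in grid]
    seen = set()
    for y in range(height):
        for x in range(width):
            if cells[y][x] != "#" or (y, x) in seen:
                continue
            # Flood fill to collect one connected group of black pixels.
            group, stack = [], [(y, x)]
            seen.add((y, x))
            while stack:
                cy, cx = stack.pop()
                group.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and cells[ny][nx] == "#" and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            if len(group) < min_size:
                for gy, gx in group:
                    cells[gy][gx] = "."
    return ["".join(line) for line in cells]


# Made-up glyph: a "7" whose top bar has faded away from its stem,
# plus one stray speck on the left of the fourth row.
seven = [
    "###...",
    "...#..",
    "...#..",
    "#..#..",
    "...#..",
]

gentle = despeckle(seven, min_size=2)      # removes only the speck
aggressive = despeckle(seven, min_size=4)  # also removes the detached top bar

print("\n".join(gentle))
print()
print("\n".join(aggressive))
```

With the gentle setting the speck goes and the 7 survives. With the aggressive setting the detached top bar is treated as noise as well, leaving a clean vertical stroke that reads as a 1.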

really have to say this is a plus for those against the non Ancestry
FreeBMD images as I have found bad characters on almost every scan. In

david.mayall...
This becomes a very philosophical discussion.

Which is better;
a) The scan on which 10% of records cannot be read, and we have to flag
them as unreadable
b) The scan on which all the records can be read, but on which 1 record
is incorrect due to a scan artefact.

I would say that (a) is better, with us stating plainly that this
record can't be read. You would doubtless say (b), but that
transcription includes a record that is wrong.

In any case, as I've already said, we are using Ancestry images
extensively. It's just that we use them without the "enhancement".

my opinion converting FreeBMD images to two colour black & white was a very
bad decision, as this almost always results in distortion and complete loss
of characters with sources in poor condition or scanned with a poor
technique, as many people who do OCR on old books will know. The Ancestry
images are in greyscale so no or minimal information is lost.

david.mayall...
Of course greyscale is better!

However, FreeBMD has to exist in the real world, and what was
achievable was governed by what could be afforded, and what
professional companies would donate in terms of scanning.


Will

P.S. Can you tell me the Ancestry scan in question that contains the error
which you described?

david.mayall...
Not without a considerable effort, and I lack the time this month.