Wild idea: Checksum GEDCOMs?

Started by Harald Alvestrand on Tuesday, September 7, 2010
Showing all 6 posts
September 7, 2010 at 7:49 PM

I woke up this morning to see that another couple of GEDCOM uploads had hit my particular area of the tree.

After merging for a while, I'm pretty sure I've seen this before - it's a GEDCOM that I've already merged; the spelling mistakes and odd choices are all in the same places.

Wouldn't it be wonderful if Geni could checksum GEDCOM files (or parts of GEDCOM files, possibly after throwing away information that Geni can't use anyway, like the specific IDs of nodes in the tree), and report back "I threw away this branch of your upload and connected it somewhere - this has been uploaded and integrated before"?
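A rough sketch of how such a checksum could work, not anything Geni is known to do: fingerprint each individual separately rather than the whole file, after dropping the node IDs and the pointers between records. The function names (`split_records`, `fingerprint`, `individual_fingerprints`) and the choice of SHA-256 are assumptions made for illustration, and the line pattern only covers the basic `level [@id@] tag [value]` shape of a GEDCOM line.

```python
import hashlib
import re

# Basic GEDCOM line shape: level, optional @XREF@ id, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")
# A value that is nothing but a pointer to another record, e.g. "@F12@".
POINTER_RE = re.compile(r"^@[^@]+@$")


def split_records(text):
    """Yield each level-0 record as a list of (level, tag, value) tuples."""
    record = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw)
        if not m:
            continue  # skip blank or malformed lines
        level, _xref, tag, value = m.groups()  # the record's own ID is discarded
        if int(level) == 0 and record:
            yield record
            record = []
        record.append((int(level), tag.upper(), (value or "").strip()))
    if record:
        yield record


def fingerprint(record):
    """Hash a record's content, ignoring pointers, case and extra whitespace."""
    kept = [
        f"{level} {tag} {' '.join(value.split()).lower()}"
        for level, tag, value in record
        if not POINTER_RE.match(value)  # drop links such as FAMS @F1@
    ]
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()


def individual_fingerprints(text):
    """Return the set of fingerprints of all INDI records in a GEDCOM text."""
    return {fingerprint(r) for r in split_records(text) if r[0][1] == "INDI"}
```

With a stored set `seen` of fingerprints from earlier uploads and `new = individual_fingerprints(upload_text)`, the share `len(new & seen) / len(new)` would give a hint of how much of an upload has been seen before. A single changed field changes that individual's fingerprint, so this only catches exact re-uploads of a person.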

September 8, 2010 at 3:33 AM

Yes, I have been merging 5000 profiles from one GEDCOM upload, and it takes up so much time that I could have put to much better use in here!

September 8, 2010 at 8:11 AM

Totally agree. Using checksums is a bit shaky though, because they only match when the files contain exactly the same information. Most users I know of get their GEDCOMs from external sources and add their own info using genealogy software before throwing the GEDCOM into Geni and other online collaboration sites. That way, it is almost impossible to rely on checksums for merging unless individuals are checksummed separately.

However, having a better interface for merging - like merging obvious candidates automatically (typically where some important fields do match exactly, including ancestors' names and such) - is welcome.
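As an illustration of what "obvious candidates" could mean, here is a minimal sketch, assuming each profile is a plain dict with hypothetical `id`, `name`, `birth`, `father` and `mother` fields. It is not Geni's actual matching logic; it pairs profiles only when all of those fields are present and match exactly after normalizing case and whitespace.

```python
from collections import defaultdict


def norm(value):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join((value or "").lower().split())


def match_key(person):
    """Build an exact-match key; return None if any important field is missing."""
    key = tuple(norm(person.get(f)) for f in ("name", "birth", "father", "mother"))
    return key if all(key) else None


def obvious_merge_candidates(existing, uploaded):
    """Return (existing_id, uploaded_id) pairs that match unambiguously."""
    index = defaultdict(list)
    for person in existing:
        key = match_key(person)
        if key:
            index[key].append(person)

    pairs = []
    for person in uploaded:
        key = match_key(person)
        hits = index.get(key, []) if key else []
        if len(hits) == 1:  # zero or several hits are left for a human to judge
            pairs.append((hits[0]["id"], person["id"]))
    return pairs
```

Requiring the parents' names as part of the key keeps two unrelated people with the same name and birth date from being paired, at the cost of missing profiles whose parents are not filled in.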

September 8, 2010 at 10:54 AM

Oh, I wish there were no GEDCOM import option at all. It only makes a mess and we have enough of that already.

September 8, 2010 at 11:11 AM

Yet my experience with introducing new people to Geni who've already dabbled in any sort of computer-based genealogy is "how do I upload my GEDCOM".... people *hate* having to retype their own work, and for the stuff they did themselves, I can't blame them. It's the fact that their export will also include the random GEDCOM information *they* imported that creates the mess.

Sigh. No perfect solutions.

September 12, 2010 at 1:59 AM

Geni is working on mechanisms to filter GEDCOM files as they are uploaded. Of course, this is a HARD problem to solve. We curators were just discussing this with Noah.
