I woke up this morning to see that another couple of GEDCOM uploads had hit my particular area of the tree.
After merging for a while, I'm pretty sure I've seen this before - it's a GEDCOM that I've already merged; the spelling mistakes and odd choices are all in the same places.
Wouldn't it be wonderful if Geni could checksum GEDCOM files (or parts of GEDCOM files, possibly after throwing away information that Geni can't use anyway, like the specific IDs of nodes in the tree), and report back "I threw away this branch of your upload and connected it somewhere - this has been uploaded and integrated before"?
Totally agree. Using checksums is a bit shaky though, because they only match when the files contain exactly the same information. Most users I know of get their GEDCOMs from external sources and add their own info using genealogy software before throwing the GEDCOM into Geni and other online collaboration sites. That makes it almost impossible to rely on checksums for merging unless each individual is checksummed separately.
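To make the per-individual idea concrete, here is a rough sketch of what such a checksum could look like. It is not how Geni works; the function name `individual_fingerprints` and the normalization rules (skip any line carrying a file-specific `@...@` pointer, collapse whitespace, lowercase) are assumptions made purely for illustration.

```python
import hashlib
import re

# Cross-reference IDs such as @I12@ or @F3@ are specific to one export.
XREF = re.compile(r"@[^@\s]+@")
# An individual record starts at level 0, e.g. "0 @I12@ INDI".
INDI_START = re.compile(r"^0\s+@[^@\s]+@\s+INDI\b")


def _digest(lines):
    """SHA-256 over the normalized lines of one individual record."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def individual_fingerprints(gedcom_text):
    """Return a set with one fingerprint per INDI record, ignoring IDs."""
    fingerprints = set()
    record = None  # lines of the INDI record being read, or None
    for raw in gedcom_text.splitlines():
        line = " ".join(raw.split())  # collapse runs of whitespace
        if not line:
            continue
        if line.startswith("0 "):
            # A new level-0 record closes whatever record came before it.
            if record:
                fingerprints.add(_digest(record))
            record = [] if INDI_START.match(line) else None
        elif record is not None:
            if XREF.search(line):
                continue  # pointer lines (FAMS/FAMC etc.) differ per file
            record.append(line.lower())
    if record:
        fingerprints.add(_digest(record))
    return fingerprints
```

Comparing the set from a new upload with the set from earlier uploads would flag individuals that arrived before in identical form, even when the rest of the file has changed. It still breaks as soon as someone edits a single field of a person, and it is sensitive to the order in which the exporting program writes sub-lines, so it only catches straight re-uploads.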
However, a better interface for merging - like merging obvious candidates automatically (typically where some important fields match exactly, including ancestors' names and such) - would be welcome.
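As a rough illustration of "obvious candidates", the sketch below groups profiles whose key fields match exactly. The `Person` fields, the choice of name, birth date and both parents' names as the key, and the function names are all hypothetical; nothing here reflects Geni's actual data model or matching rules.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical minimal profile; real profiles carry far more data.
    profile_id: str
    name: str
    birth_date: str
    father_name: str
    mother_name: str


def match_key(person):
    """Return a normalized key, or None if any key field is blank."""
    fields = (person.name, person.birth_date,
              person.father_name, person.mother_name)
    if not all(f.strip() for f in fields):
        return None  # too little information to call it "obvious"
    return tuple(" ".join(f.lower().split()) for f in fields)


def obvious_merge_candidates(people):
    """Group profile IDs whose key fields all match exactly."""
    groups = defaultdict(list)
    for person in people:
        key = match_key(person)
        if key is not None:
            groups[key].append(person.profile_id)
    return [ids for ids in groups.values() if len(ids) > 1]
```

Requiring every key field to be filled in keeps the automatic step conservative: profiles with missing parents or dates are left for a human to merge by hand.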
Yet my experience with introducing new people to Geni who've already dabbled in any sort of computer-based genealogy is that the first question is "how do I upload my GEDCOM?" People *hate* having to retype their own work, and for the stuff they did themselves, I can't blame them. It's the fact that their export will also include the random GEDCOM information *they* imported that creates the mess.
Sigh. No perfect solutions.