Databases Programming

Falsehoods Programmers Believe About Names 773

Posted by timothy
from the can't-we-stick-to-slashdot-user-ids? dept.
Jamie points out this interesting article about how hard it is for programmers to get names right. Since software ultimately is used by and for humans, and we humans are pretty tightly linked to our names (whatever the language, spelling, or orthography), this is a big deal. This piece notes some of the ways that names get mishandled, and suggests rules of thumb (in the form of anti-suggestions) to encourage programmers to handle names more gracefully.
This discussion has been archived. No new comments can be posted.

  • by jra (5600) on Thursday June 17, 2010 @08:59PM (#32609054)

    I found the piece very interesting.

    Though my inability to post this comment appears to have outlived the slashdotting of the site.

  • by Wonko the Sane (25252) * on Thursday June 17, 2010 @09:15PM (#32609136) Journal

    I am fortunate enough to be the child of a professional smart-ass who intentionally gave all his children two middle names so that we would not fit into the computer systems of the era.

    When I grew up my parents used my first middle name as a "given nickname" (it's actually in quotation marks on my birth certificate). So most of the time when I give my name for something I use my "given nickname" as my first name. Unless I feel like using my legal first name as my first name in which case I use that. There are probably four or five different versions of my name attached to my SSN in various different databases.

    I've also got a suffix: III. I don't have two ancestors with the exact same name as me, but since the various parts come from two different relatives my parents settled on III.

  • by TheLink (130905) on Thursday June 17, 2010 @09:21PM (#32609180) Journal
    I dunno, the guy just lists out reasons why you can't uniquely identify people by names. e.g. "some people don't have names".

    Well, that's why governments start handing out national ID numbers to people[1]. Then even if you aren't who you claim you are, at least the poor data entry person has something to key in and can actually type it in on his/her keyboard ;).

    [1] As for foreigners without a passport number or national ID, please wait here for those friendly guys in uniforms...
  • who needs vowels? (Score:4, Interesting)

    by theNAM666 (179776) on Thursday June 17, 2010 @09:24PM (#32609192)

    Bo3b Johnson

    http://www.linkedin.com/pub/bo3b-johnson/13/846/a52 [linkedin.com]

    The 3 is silent. And no, I don't know him but I know someone who does.

  • by jackb_guppy (204733) on Thursday June 17, 2010 @09:35PM (#32609258)

    Most developers do not get that the world is made up of multiple standards, and they refuse to consider local vs. database relationships like:

    Boolean: Yes/Si/... No/No/..

    Amounts: ,. ., 0,2,3 places from right for display, 0,2,3 places from right for value (USD: ,.22, JPY: .,20)
    Dates: ./ 0 suppressed ISO, JPN, USA

    How do you think they do working with phone #s, addresses and names?

    Database engines fail with these simple-complex constructs because sorting and matching tests are still left hand and character set driven. A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

  • by scdeimos (632778) on Thursday June 17, 2010 @10:02PM (#32609392)

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    Are you sure? What if "Mac Clean" is actually somebody's first and last names?

    I know plenty of people whose legal name is a single word, such as "Alex", "Max" or "Virgil." Would your system put that in the first_name, middle_name or surname column? Storing names and using them sensibly is hard, as TFA acknowledges.

    You'd think that e-mail addresses by comparison would be simpler, but I have a hard time trying to register my e-mail address with sites that won't allow even simple things like "+", "-" or "." characters in the local part.

  • by Dragonslicer (991472) on Thursday June 17, 2010 @10:06PM (#32609410)

    A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ...

    I assume you left out a "not" in that sentence? I think there are quite a few people that will kindly (or maybe not-so-kindly) explain why "Mc" and "Mac" are not the same.

  • not surprising (Score:3, Interesting)

    by Phoenix Dreamscape (205064) on Thursday June 17, 2010 @10:08PM (#32609432) Homepage

    Considering how many entry forms still don't allow '+' in an e-mail address (or, worse, allow it in the sign-up box but not in the unsubscribe box), and considering how many banks still restrict you to an 8-character password, does it come as any surprise that they have difficulty with something that isn't defined in an RFC [ietf.org]?

  • by Speare (84249) on Thursday June 17, 2010 @10:14PM (#32609452) Homepage Journal
    Love the literary reference. In a much earlier sci-fi story, This Perfect Day [wikipedia.org], every citizen has a nameber, an identifier that is part name, part number. There are only four male names, four female names, and these are combined with a multi-digit code to make the ID unique. Ever since online forums started suggesting logins like "MaryBeth131" I can't help but think of namebers.
  • by shutdown -p now (807394) on Thursday June 17, 2010 @10:43PM (#32609594) Journal

    That said, if your input form doesn't allow some guy to type in his name with tone number suffixes on a US Windows keyboard layout where he lacks access to diacritics, then you're not a very thoughtful programmer.

    Or you code in some language where Unicode support is not there by default, and you have to jump through hoops to get it working.

    Like, say, PHP. Or stable Ruby.

    Which might explain a lot of things about why so much of the Net is largely broken I18N-wise even on the most basic level, come to think of it.

  • by arekq (651007) on Thursday June 17, 2010 @11:09PM (#32609718)

    A similar issue happens with Chinese names.
    Most Chinese people have one word or two word names.
    If a person has a two-word name and fills it in on a form as "Chow, Yun Fat", the system would likely take "Yun" as the middle name and "Fat" as the first name, or vice versa, which often reduces the name to "Chow, Yun" or "Chow, Fat".
    One way to reduce this confusion is to use a hyphen to join the words, like "Chow, Yun-Fat".
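
    A minimal sketch of the failure described above, assuming a hypothetical form handler (`naive_parse`) that splits the given name on whitespace and a hypothetical short display (`short_display`) that keeps only the first slot. Real systems vary; this only illustrates why the hyphen workaround survives the split.

```python
def naive_parse(entry):
    """Split 'Family, Given ...' the way a first/middle/last form often does."""
    family, _, given = entry.partition(",")
    tokens = given.split()
    return {
        "last": family.strip(),
        "first": tokens[0] if tokens else "",
        # Everything after the first token lands in the "middle" slot.
        "middle": " ".join(tokens[1:]),
    }


def short_display(parts):
    """Render the common 'Last, First' short form, which drops the middle slot."""
    return f"{parts['last']}, {parts['first']}"


# The space-separated given name loses its second word in the short form.
spaced = short_display(naive_parse("Chow, Yun Fat"))
# The hyphenated given name is a single token, so it survives intact.
hyphenated = short_display(naive_parse("Chow, Yun-Fat"))

print(spaced)
print(hyphenated)
```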

  • by mogness (1697042) on Thursday June 17, 2010 @11:12PM (#32609734) Homepage
    If you are a guy (not an unreasonable assumption on /.), I think it's really strange that online forums are suggesting the name "MaryBeth131" to you.
    What were your parents thinking?
  • Re:Well Duh (Score:3, Interesting)

    by Eskarel (565631) on Thursday June 17, 2010 @11:21PM (#32609786)
    1. Don't use names as a unique identifier, they're not.
    2. Cher has a last name, as most likely did Homer and Virgil and everyone else, their last names might have been "from _____ or the ______", but they still had one.
    3. It's illegal to use SSN as a unique identifier, so don't use it as one.
    4. Who cares, don't muck around with case, and search case insensitive, more matches are better than not enough.
    5. There are conventions to get around that in ASCII, but unicode solves most of it anyway.
    6. Always properly encode and decode your data to meet the requirements of your medium.
    7. You still have to have name limits, and someone's name will always break it, using some ridiculous number of characters in your database is just going to kill your database.
    8. No one's name is a single letter in any language I've ever heard of (a single character, but that's not the same thing), and since names aren't unique or identifying this doesn't really matter.
    9. Who cares?

    Names are not meaningful except to the people who have them, and they're deluding themselves. You are not your name, and your name is not you.

  • by Anonymous Coward on Friday June 18, 2010 @12:00AM (#32609956)

    >A database MUST treat all of these names the same: McClean, MacClean, MCLean, Mc Clean, Mac Clean. McCleen, ..

    Given that you used upper case for the "MUST", it would be nice if your statement wasn't total garbage. McClean and MacClean aren't the same surname. You do know that, right? They sound the same, but they are actually different. I am an HR systems developer and I can tell you that people get very upset when people spell their names incorrectly, including using incorrect capitalisation.

    If you had stated that some kinds of searches should return names with spellings that are similar to (or sound like) the specified search criteria, you might be kind of on the right track, but to say they must be treated the same is patently utter toss. When you have 40,000 employees, you are likely to have multiple people with very similar names. I'd kind of like not to treat them as if they are the same person and if I'm searching for one, I don't always want to have to put up with a bunch of inexact matches just because some bozo who has no idea has decided they all need to be treated as being the same.
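
    A rough sketch of that distinction, assuming a hypothetical in-memory list of surnames: the stored spelling is never altered, an exact search returns only the exact spelling, and a separate opt-in search ranks similar spellings. The `fold` helper, the 0.8 cutoff and the use of `difflib.SequenceMatcher` are illustrative choices, not anything the thread prescribes; a real HR system might use a phonetic algorithm instead.

```python
from difflib import SequenceMatcher

# Hypothetical sample data; each spelling is stored exactly as entered.
NAMES = ["McClean", "MacClean", "MCLean", "Mc Clean", "Mac Clean", "McCleen", "Johnson"]


def fold(name):
    """Comparison key only: ignore case and spaces. Never store this form."""
    return "".join(name.casefold().split())


def exact_search(query, names):
    """Return only records whose stored spelling matches the query exactly."""
    return [n for n in names if n == query]


def similar_search(query, names, cutoff=0.8):
    """Return stored spellings that resemble the query, best match first."""
    key = fold(query)
    scored = [(SequenceMatcher(None, key, fold(n)).ratio(), n) for n in names]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [n for score, n in scored if score >= cutoff]


print(exact_search("McClean", NAMES))
print(similar_search("McClean", NAMES))
```

    Keeping the two searches separate lets the caller decide when inexact matches are wanted, rather than having the database collapse different surnames into one.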

  • by Malc (1751) on Friday June 18, 2010 @12:42AM (#32610144)

    A lot of mobile phones, including my Samsung phone, use Pinyin as a way of entering Chinese characters. For each word/syllable I enter, there's a sometimes long list of matching Chinese characters to select from.

    Pinyin is also used on things like street signs in some of the larger cities, which gives Western people at least some chance of recognising names.

  • by vux984 (928602) on Friday June 18, 2010 @12:43AM (#32610148)

    "Bo3b"

    Never seen that one but I've heard of a: !bo

    The leading exclamation mark is apparently a... lol, I dunno what it's called, but it's apparently one of the hollow popping/clicking sounds you see in some African languages.

  • Re:Article text (Score:3, Interesting)

    by bertok (226922) on Friday June 18, 2010 @12:58AM (#32610216)

    Reminds me of a classic database developer nightmare story that I heard:

    A local school was receiving complaints that two students were getting the exam results and the like mixed up.

    The two students? Identical twins living in the same house, with the same name.. John Smith Jnr.

    Apparently their father was John Smith Snr, and the whole "Senior / Junior" thing has been done for generations of "John Smiths", and it was a tradition and all, and we can't just break a tradition just because we had twin boys.. so... we'll name them both John Smith Jnr.

  • by snowgirl (978879) on Friday June 18, 2010 @01:37AM (#32610338) Journal

    Funny, I actually use the Chinese IME on Windows... it is called "Chinese (simplified) - Microsoft Pinyin - New Input Style"

    And I do actually type in characters using Pinyin, because they have adaptive algorithms that guess at what the most likely character to follow is. They guess well, but it also displays 9 choices at a time, that you select with number keys.

  • by Anonymous Coward on Friday June 18, 2010 @02:03AM (#32610404)

    http://en.wikipedia.org/wiki/Perri_6

    This is how you become top listed in every citation index.

  • by JaredOfEuropa (526365) on Friday June 18, 2010 @02:05AM (#32610412) Journal
    I have an apostrophe in my surname, and you'd be surprised at how many systems break when I try to enter it... even in this day and age where character escaping and scrubbing for SQL are readily available in most languages, often even in the standard libraries. And you'd be surprised at how many systems return a response that hints at something like that cartoon being possible...

    Even worse are the systems that seem to accept the response, then break down internally. I've had some bitter arguments over reservations at car rental and airline check in counters.
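
    A minimal sketch of the "readily available" fix, using Python's standard-library `sqlite3` module with a parameterized query. The `people` table and the surname "O'Brien" are hypothetical stand-ins; the point is only that the driver handles the apostrophe, so no hand-written escaping or scrubbing is needed.

```python
import sqlite3

# In-memory database with a hypothetical one-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (surname TEXT)")

surname = "O'Brien"  # example value containing an apostrophe

# The "?" placeholder passes the value separately from the SQL text,
# so the apostrophe cannot terminate the string literal.
conn.execute("INSERT INTO people (surname) VALUES (?)", (surname,))

row = conn.execute(
    "SELECT surname FROM people WHERE surname = ?", (surname,)
).fetchone()
stored = row[0]
print(stored)

conn.close()
```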
  • I didn't understand (Score:5, Interesting)

    by SimonInOz (579741) on Friday June 18, 2010 @02:56AM (#32610574)

    I thought the article was about the inability of programmers to remember names and recognise people. Maybe I should have read the article.

    It's a real problem though - is it just me? I often know things about people (ah yes, plays squash, good at making cakes, father of that kid who rides a unicycle), but their actual name - no. It's a miracle if I recognise them at all.
    Mind you, it means if anyone says "Hello" to me, I am obliged to be polite to them as I might actually know them quite well, but haven't recognised them yet - and certainly don't know their name.

    It's a right pain. Anybody else suffer from this - and what the heck do they do about it? (I'd like a camera attachment that would whisper in my ear "that's Mrs Jones, her daughter, Kira, is in the same class at school as your daughter. Likes chess and is obsessed with kayaking" - something tiny that could clip on my glasses, maybe).

  • by Hognoxious (631665) on Friday June 18, 2010 @03:11AM (#32610610) Homepage Journal

    Why would you be doing the validation in the database?

    If he'd meant "should treat them all as valid", then he should have written that.

  • by unkiereamus (1061340) on Friday June 18, 2010 @03:20AM (#32610636)
    I actually knew a girl in HS who came from a very traditional Mexican family, as a result, she had 7 middle names.

    Here's the thing, in California, in order to be issued a driver's license, your full name had to appear on the card, and there was insufficient space for all of her middle names, as a result, in order to get a driver's license, she had to have her name legally changed.
  • by Anonymous Coward on Friday June 18, 2010 @05:01AM (#32611024)

    Because no one ever automated the process of filling out web-forms right?

  • by Sique (173459) on Friday June 18, 2010 @05:29AM (#32611100) Homepage

    To make things worse, it's not necessarily the family name you use to address someone politely.

    If you have to speak to Paul McCartney (of Beatles' fame), you have to formally address him as "Sir Paul". No, "Sir McCartney" is impolite, you shouldn't use it.
    If you have to speak to Vladimir Putin, you won't address him as "Mr. Putin". It's "Vladimir Vladimirovich", please!

  • by Migala77 (1179151) on Friday June 18, 2010 @06:49AM (#32611424)

    Proper email validation is not trivial

    The regular expression, if one must be used, doesn't need to be any more complex than:

    ^[^@]+@[^@]+$

    Actually, the local part of an e-mail address can be a quoted string [ietf.org], containing pretty much any character, so "user@host"@example.org is a perfectly valid e-mail address, and doesn't match your regex. Most systems won't accept it, but it's valid...
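
    A small sketch of the point above: the quoted regex rejects the quoted-string address, while a looser check that splits on the last "@" accepts it. The `looks_like_email` helper is a hypothetical illustration and not a full RFC validator; it only checks that something sits on both sides of the final "@".

```python
import re

# The regex quoted in the thread: exactly one "@" allowed anywhere.
SIMPLE = re.compile(r"^[^@]+@[^@]+$")


def looks_like_email(address):
    """Loose check: non-empty local part and domain around the LAST '@'."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and domain)


quoted = '"user@host"@example.org'  # valid: the local part is a quoted string
plus = "user+tag@example.org"

print(SIMPLE.match(quoted) is None)   # the simple regex rejects it
print(looks_like_email(quoted))       # the loose check accepts it
print(looks_like_email(plus))
print(looks_like_email("no-at-sign"))
```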

  • Re:Well Duh (Score:1, Interesting)

    by Anonymous Coward on Friday June 18, 2010 @07:47AM (#32611752)

    Also, "Teller" (from Penn & Teller) has no first name, which may or may not be an important distinction.

  • by mcgrew (92797) * on Friday June 18, 2010 @07:53AM (#32611802) Homepage Journal

    This one was humorous: "surely people's names are diverse enough such that no million people share the same name"

    Anybody who's ever run a mainframe database knows that's just stupid. Back in 1997 Altavista found six people with my exact full name on the internet. In 1997!

    I hate doing a name lookup on my company's database -- do you have any idea how many people in Chicago alone are named "Johnson"? I once joked that they should rename it Johnson City.

    And in the US, there are people with more than one SSN, and people who have none at all. I know a guy with two SSNs, he somehow got in the middle of a feud between the Outlaws and the Hell's Angels about twenty years ago and a judge ordered him to change both his name and SSN, so it was not only done with the government's blessing, but on their orders.
