Names That Break Computers (bbc.com) 372
Reader Thelasko writes: The BBC has a story about people with names that break computer databases. "When Jennifer Null tries to buy a plane ticket, she gets an error message on most websites. The site will say she has left the surname field blank and ask her to try again."
Thelasko compares it to the XKCD comic about Bobby Tables, though it's a real problem that's also been experienced by a Hawaiian woman named Janice Keihanaikukauakahihulihe'ekahaunaele, whose last name exceeds the 36-character limit on state ID cards. And in 2010, programmer John Graham-Cumming complained about web sites (including Yahoo) which refused to accept hyphenated last names.
Programmer Patrick McKenzie pointed the BBC to a 2011 W3C post highlighting the key issues with names, along with his own list of common mistaken assumptions. "They don't necessarily test for the edge cases," McKenzie says, noting that even when filing his own income taxes in Japan, his last name exceeds the number of characters allowed.
Updated Policy: (Score:4, Funny)
Re: (Score:2)
All users will be assigned Social Security Numbers for standardized interaction with all systems. Thank you for your compliance with this exciting and mandatory efficiency initiative.
FTFY - That "problem" got fixed in 1987 when the IRS required Social Security Numbers to claim children dependents on tax returns. Your tax dollars at work.
http://www.snopes.com/business/taxes/dependents.asp [snopes.com]
Re: (Score:2)
I'm pretty sure your system would reject my German social security number ... for no apparent reason.
Re:Updated Policy: (Score:5, Informative)
After the IRS started requiring social security numbers to claim children dependents on tax returns, about 7 million of them vanished. In this case, it appears that the move was justified. http://www.snopes.com/business... [snopes.com]
Re: (Score:3)
After the IRS started requiring social security numbers to claim children dependents on tax returns, about 7 million of them vanished.
With justifications like this, it is now far easier to consider things like drug testing welfare recipients, requiring voters to have IDs, and so on.
Re: (Score:2)
Wrong. As with cod labeling, all units will be assigned automatic, random barcodes which are inked into their skin. Resistance is futile; death will be assigned to all non-compliant software, including those with grandfathered unique names and social security numbers.
Re: (Score:2)
Seems like there should be libraries for handling names by now. Most popular languages have libraries for handling time, URLs, regexes, Unicode screw-ups, sanitizing input etc.
Re: (Score:2)
Actually, thinking about it this might be a problem that we can't solve today. For example, some people's names can't be written in Unicode, and even when Unicode does have the necessary characters there is no way to correctly render them for all Chinese, Japanese and Korean people. You have to pick one of the three renderings and Unicode gives you no hint which one.
So the first step is to fix Unicode, which is kind of a massive undertaking.
Re: (Score:2)
In related news, I once had a data download blow because someone was named Nuñez. The original mainframe system had fixed-length fields (as mainframe data often does). However, the original green-screen monitors had been replaced with Windows terminal emulators and someone had entered the actual n-tilde character that would have been impossible on the IBM 3270 US keyboard. So now we have a code that maps from ñ to n~ as it's converted from EBCDIC to ASCII, expanding the length of the field and the
Re: (Score:3)
The characters which Unicode contains are independent of the encoding used to represent them. UTF-8 and UTF-16 can represent the whole (just over 20-bit) range of Unicode codepoints. The two problems described by GPP are unsupported characters [modelviewculture.com] and Han unification [wikipedia.org].
Re: (Score:3)
The issue isn't how many characters exist. There is room to add more characters to Unicode when missing ones are found. The big failure of Unicode is Han Unification, which is basically like saying "Well the character A in America has the same *meaning* as the character B in Canada, so let's only issue one codepoint for A/B" and now when you type an A on your American computer, all Canadians see a B because their fonts render the exact same character differently. This happened with many common characters t
Re: (Score:3)
95% of Han Unification doesn't seem like a problem to me. The slight stylistic differences between Chinese and Japanese where it's just a matter of "these tiny strokes point slightly left in Chinese and slightly right in Japanese" can still easily be understood no matter what font. Even slightly more stylistic differences don't actually cause any problems. For example, these two Kanji: http://jisho.org/search/%23kan... [jisho.org] and other Kanji that have these shapes inside of them. The fonts tend to show the Chinese
Re:Updated Policy: (Score:5, Informative)
They do exist; they are called string parsers.
The real issue is that practically *any* integer could be a valid text character in any given input because of the number of codepages that exist. Then you have to take the trouble of identifying the specific codepage used by the input to know what can be safely excluded. Then you need to deal with non-printable control characters. Which amounts to reading the raw bytes of the input to decide how to interpret or print each character. (Example UTF-8: the first byte of a multi-byte character starts with as many 1 bits as there are bytes in that character, terminated by a zero bit, and every following byte starts with the bits 10. If the character is a single byte, the first bit is zero and the remaining 7 bits are the character data. Misinterpret a bit or get misaligned, and you start interpreting garbage.) Etc.
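The lead-byte rule in that UTF-8 example can be sketched in a few lines of Python. This is only an illustration: `utf8_sequence_length` and `split_characters` are hypothetical helper names, and the code classifies lead bytes without fully validating the sequence (it does not check continuation bytes or overlong encodings).

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes a UTF-8 character occupies, judged from its first byte."""
    if (first_byte & 0b1000_0000) == 0:
        return 1  # 0xxxxxxx: single byte, 7 bits of character data
    if (first_byte & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx: two 1 bits, then a 0
    if (first_byte & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx: three 1 bits, then a 0
    if (first_byte & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx: four 1 bits, then a 0
    # 10xxxxxx is a continuation byte: seeing it here means we are misaligned.
    raise ValueError(f"not a UTF-8 lead byte: {first_byte:#04x}")


def split_characters(data: bytes) -> list[bytes]:
    """Split UTF-8 bytes into per-character chunks using only the lead bytes."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks
```

For instance, `split_characters("Nuñez".encode("utf-8"))` yields five chunks, with the ñ occupying two bytes (`b'\xc3\xb1'`), while starting on the second of those bytes raises `ValueError`, which is the misalignment case mentioned above.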
Add all of this complexity to a short time span to develop libraries, (i.e. it needs to be done three days ago), and minimal budget, ("What do you mean we need to support diacritics? No we're not spending that money to add support for it. Ship the damn thing without it, if they want it they can pay for an upgrade.") and you can see why these problems exist. Mostly it's the idea that the support isn't needed for everyone so they can get away with not implementing it and blame any issues that crop up on the end user / some bug / a bad connection / etc.
Sadly TFA is yet another call to attention for this issue that ultimately will not be addressed unless it gets "fixed" by an unrelated upgrade / patch being rolled out that just so happens to fix these kinds of issues, in addition to whatever the real purpose of the upgrade / patch was.
PS: Read the summary, if "NULL" is considered a valid error result from a string parser, then that parser needs to be rewritten to support proper error codes. Practically anything could be valid input and returning the error status as part of the damn output string is ASKING for trouble. Why? Because then you need a parser to check the error status, so the original parser just made more work for the caller, and guess what? Something tells me the caller didn't check for the EXACT error string correctly, and thus interpreted "Null" as "NULL". Hence the error given to the user.
Interesting read about names (Score:4, Informative)
http://www.kalzumeus.com/2010/... [kalzumeus.com]
Nothing to say, read it.
There is similar stuff about Dates, Time, Time Zones etc. on the internet. I should make a collection of it.
But I can't figure how to write into my /. journal nor how to use the old /. bookmark feature.
Re:Interesting read about names (Score:4, Informative)
When it ends in PORN... (Score:2)
Re: (Score:2)
A user account feature of a company I worked for tried to censor names. Any name with any censored word buried in it got turned away. Mr. Brass and Mrs. Lassiter, we never got to serve them.
Re: (Score:3)
Mr. Brass and Mrs. Lassiter, we never got to serve them.
I had a Chinese-American teacher called Mr. Fuch. The other teachers had a hard time trying not to mispronounce his last name. They all fucked it up.
Re: (Score:2)
I have this problem sometimes. My kids have my last name, Kass, and several of the new interactive things at Disney World a few years ago refused to accept their names, or later refused to accept my email address as a place to email their creations.
Re: When it ends in PORN... (Score:2)
That one took an extra-special exercise in stupidity to pull off.
Re: (Score:3)
Re: (Score:3)
Was more likely a Thai lady than an Indian.
"Pon" or "Porn" is a common last syllable in Thai, for given names as well as family names.
Perhaps she was Indian by birth but Thai by ancestry?
Re: (Score:2)
Perhaps she was Indian by birth but Thai by ancestry?
Not sure how she came about the last name. Intuit had her stuffed into a small conference with 30 other Indians. They're all talking to each other while working on their laptops, which help desk couldn't fix because they were personal laptops. Their boss was a big guy in the center of the room who talked over them on a conference call. Weirdest scene I've ever seen in Corporate America. That was in 2004 or so.
Hyphens in last names? (Score:5, Funny)
Re:Hyphens in last names? (Score:4, Informative)
More to the point, care about the future. Do you really want your children's children to be called Robert Smith-Schmidt-Maier-Kilgore? Not picking a single last name is just a huge FU to all future generations.
Re:Hyphens in last names? (Score:4, Funny)
Re: (Score:3, Funny)
Y-TO-K STATUS REPORT
Our staff has completed the 18 months of work on time and on budget. We have gone through every line of code in every program in every system. We have analyzed all databases, all data files, including backups and historic archives, and modified all data to reflect the change.
We are proud to report that we have completed the "Y-to-K" date change mission, and have now implemented all changes to all programs and all data to reflect your new standards:
Januark, Februark, March, April, Mak, Ju
Re: (Score:2, Informative)
wisnoskij,
I trust you understand that hyphenated last names in English have a definite form.
For example, Dr. Martin Lloyd-Jones used both his mother's and his father's last names in a hyphenated form.
When children come about, one of the names, usually the mother's last name, is dropped.
So Dr. Lloyd-Jones's child would become Robert Jones.
Now Robert Jones may want his mother's name and become Robert Smythe-Jones.
Only in America would the atrocity of a multiply hyphenated name stand a chance of occurring since
Re: Hyphens in last names? (Score:2)
Re: (Score:2)
Re: (Score:3)
Re: (Score:2)
Must have been what turned Leopold Ritter von Sacher-Masoch. He refused to accept having his name truncated and kept entering it despite the abuse he took.
Re: (Score:2)
I'm sure at one point this was discussed by Mr Cowboy and Ms Neal.
Re: (Score:2)
Hyphenated names are a good way to merge families rather than demanding that the family lineage must go through the father and/or mother. If anything, it's progressive.
For example, Mr. Johnson marries Miss Johnson, and they decide to go for the classic hyphenated name of Johnson-Johnson.
Later, Mr. Johnson-Johnson meets Miss Johnson-Johnson, creating the new Johnson-Johnson-Johnson-Johnson family.
Re: Hyphens in last names? (Score:2)
Re: (Score:2)
My daughter had some social issues with a classmate when she was in the 4th grade. We told her don't worry about it, in high school she'll get what's coming in the form of karma. The other girl's name? Jenny Swallows.
Here we are 6 years later, and yes, Jenny does have quite a lot to deal with at school....
Re: (Score:2)
As someone who did account renames for a couple of years, I don't think this guy was joking.
Hyphenated names are the worst, just comes off as some pretentious bullshit from someone who thinks they are hanging on to some heritage prestigious names. That shit is long gone, that era is gone, the people who had hyphenated names, 80 to 200 years ago? They probably came from a very wealthy family and were holding on to some kind of name that's been around for 1000 years, but nowadays? I can't help but feel it's
Re: (Score:2)
Everyone I know of with hyphenated names ended up getting married after some professional career where the name had some value. One is a lawyer who had her own practice for 10 years before getting married. The other is a real estate agent who spent a ton marketing her name before getting married. The latter almost decided not to marry because of the issue.
Re: (Score:3)
The other is a real estate agent who spent a ton marketing her name before getting married. The latter almost decided not to marry because of the issue.
That's about the silliest thing I think I've ever heard. It's not like (a) you have to take a new name when you marry or (b) you can't take it but use your old name as a professional alias.
Re: (Score:3)
Not very well known fiction relating to this: The Man Whose Name Wouldn't Fit: Or, The Case of Cartwright-Chickering
One-line synopsis: Arthur Duane Cartwright-Chickering is fired from his job because the new computer that processes employee files cannot handle his long name.
I have a copy of it somewhere, under stuff. I'd read it if I knew where it was right now.
Aw, come on ... (Score:4, Informative)
Her name in a (web) form would be put into a database field as a string ... the word NULL is a keyword, not the string 'NULL'. I am not saying that this did not happen, I just find it hard to see how a string and a database keyword could possibly be confused?
It would be: INSERT INTO Customer (Surname) VALUES ('NULL')
not: INSERT INTO Customer (Surname) VALUES (NULL)
Re: (Score:2)
There's some app-tier logic that tends to fuck this up. I myself had to deal with it.
In my case, I had written a set of webservices that took in parameters as POST form variables, and updated records accordingly. Parameters that were not sent were not modified. POST form variables are string-only, so I had originally planned for the empty string ("") to be the value indicating "set this field to null", but that caused problems for the web-tier developer, so (under mild protest) I made it so the string "null
Re: (Score:2)
If you must put "null" in a text field then put json in the text field.
Re: (Score:3)
INSERT INTO Customer (Surname) VALUES ('NULL')
Actually, I would hope that particular line would be more along the lines of
INSERT INTO Customer (Surname) VALUES (?)
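A minimal sketch of that placeholder form, using Python's standard `sqlite3` module. The `Customer` table and `Surname` column come from the statement above; the in-memory database and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Surname TEXT)")

# Bound parameters: the values travel separately from the SQL text,
# so a surname can never be mistaken for a keyword or break the quoting.
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", ("Null",))     # a surname
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", (None,))       # a real SQL NULL
conn.execute("INSERT INTO Customer (Surname) VALUES (?)", ("O'Brien",))  # apostrophe, no escaping

rows = conn.execute(
    "SELECT Surname, Surname IS NULL FROM Customer ORDER BY rowid"
).fetchall()
print(rows)  # [('Null', 0), (None, 1), ("O'Brien", 0)]
```

The surname `Null` comes back as ordinary text and only the row inserted from Python's `None` is reported as NULL, so any confusion between the two has to be introduced somewhere outside the query itself.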
Re: (Score:2)
You execute the code, parse the output and assume any field with the value "null" is null.
Re:Aw, come on ... (Score:5, Insightful)
Have you ever seen an application (web or otherwise) that tested an input field against the value "NULL" ? Yes: test if it is NULL (note the missing quote marks) or if it is the empty string, but not the string "NULL". I can, just about, accept that some programmer high on something illegal might have done so once, but the impression given by the article is that this happens a lot.
I find this hard to believe. If it were true then the applications involved would be open to worse exploits than simple SQL injection.
Re: (Score:2)
Re: (Score:2)
Or format is address, first name, last name and the address is "2, 24 some street" and the last name is Jennifer.
Re: (Score:2)
There is always that one developer who just knows that his SQL is 1.9 times as fast as the SQL used by frameworks, so he hard codes it.
Teh (Score:5, Funny)
An Asian co-worker of mine whose family name is Teh has found that his name is almost impossible to type in tools like Microsoft Word, which auto-correct Teh to The.
Re: (Score:2)
Not being able to type it in Word is just not knowing how to use Word. The Autocorrect options are very adjustable. Add the word to the dictionary and be done with it.
Now as for every other piece of damn software out there such as Windows 10's built in autocorrect which affects all apps, that's not so easy.
Re: (Score:2)
Re: (Score:3)
Well, don't blame me. I voted for Kodos!
Re: (Score:2)
An Asian co-worker of mine whose family name is Teh has found that his name is almost impossible to type in tools like Microsoft Word, which auto-correct Teh to The.
Failure to pay attention to "auto-correct" is a user error. Also, please note that with Word (and other word processors) this issue is handled by adding the word to your dictionary.
This is *NOT* exclusively a Microsoft Word issue, but thanks for your Micro$loth prattle.
Re: (Score:2)
An Asian co-worker of mine whose family name is Teh has found that his name is almost impossible to type in tools like Microsoft Word, which auto-correct Teh to The.
In other words, your friend was just one google answer away from finding out how to add his name to a custom dictionary in Word.
Re: (Score:2)
Yeah he knows how to do that but all the other people typing his name (like at the power utility for example) have to learn it just for him.
Re: (Score:2)
Oh dog! sometimes autocorrect is teh ducking worst :(
Re: (Score:2)
Yeah but I think it is more people sending documents to him which arrive as "Mr The". They don't even notice the autocorrect.
Last name cannot be left Blank (Score:2)
And then there's filters... (Score:4, Funny)
I've had issues a few times with filters on names rejecting mine for supposedly referring to a body part...
Re:And then there's filters... (Score:5, Funny)
I heard a story from a college friend of mine about someone in his family, his dad I think, getting in some trouble while drinking with some Army buddies. So these three friends go out and have a few too many and are picked up by the local police for public intoxication or something similar. The cop asked for their names. They replied in turn, Dicks, Cox, and Bahl (pronounced like "ball"). The cop thought they were trying to be funny. They were hauled off to the station and were only released after the First Sergeant showed up to verify their names.
Re: (Score:2)
This is a common enough event that it has a name: The Scunthorpe Problem [wikipedia.org]. Naive spam blockers are a pox on the internet.
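A small Python sketch of why the naive approach fails. The one-word `BLOCKLIST` and both function names are hypothetical, chosen only to reproduce the rejections described earlier in the thread.

```python
import re

BLOCKLIST = {"ass"}  # hypothetical one-word blocklist, for illustration only


def naive_filter_rejects(name: str) -> bool:
    """Substring match: rejects any name that merely contains a blocked word."""
    lowered = name.lower()
    return any(word in lowered for word in BLOCKLIST)


def whole_word_filter_rejects(name: str) -> bool:
    """Whole-word match: rejects only when a blocked word stands alone."""
    words = re.findall(r"[^\W\d_]+", name.lower())  # runs of letters
    return any(word in BLOCKLIST for word in words)


for surname in ("Brass", "Lassiter", "Kass"):
    print(surname, naive_filter_rejects(surname), whole_word_filter_rejects(surname))
```

All three surnames are rejected by the substring filter and accepted by the whole-word one. Whole-word matching still turns away anyone whose entire surname happens to be on the list, so for name fields the safer choice is probably not to filter at all.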
Had a student in a DB class - last name "Long" (Score:2)
Now who is laughing at Iceland? (Score:2)
It is high time the government refuses to register any name that is not Unicode compliant, within so many bytes with some res
Re: (Score:2)
Re: (Score:2)
If a redneck can't read it must be un-American
You honour your nick
Ridiculous Premise (Score:4, Insightful)
unicode (Score:2)
Re: (Score:2)
Slashdot has always supported at least Latin-1, which includes umlauts. Although there was a brief period of time when you had to write them as HTML entities.
Slashdot supported Unicode for a while, but support for anything beyond Latin-1 was later disabled due to the antics of page-widening trolls and such. Why reach for a newspaper to swat that fly when you can grab a sledgehammer instead?
Programers can not even figures (Score:4, Interesting)
Most programmers can not even figure out how to validate a f--ing email address, let alone a person's name.
How about they fix the email problem first and stop rejecting my email address ^_^@mydomain
Yes, you can put that on my domain listed below and email me, and yes it is a valid email address as per the RFC.
Re: (Score:3, Insightful)
Most programmers can not even figure out how to validate a f--ing email address, let alone a person's name.
How about they fix the email problem first and stop rejecting my email address ^_^@mydomain
Yes, you can put that on my domain listed below and email me, and yes it is a valid email address as per the RFC.
Because the spec for email addresses is ridiculously complex. The problem isn't that programmers can't validate email addresses, it's that they can't write good specs for email addresses in the first place.
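One common way around that complexity is to check almost nothing in code and let a confirmation message prove the address works. The Python sketch below assumes that approach; `looks_like_email` is a hypothetical helper, not a full RFC 5322 parser.

```python
def looks_like_email(address: str) -> bool:
    """Deliberately permissive check: something, an '@', then something."""
    # Split on the LAST '@', since a quoted local part may itself contain one.
    local, at, domain = address.rpartition("@")
    return bool(at) and bool(local) and bool(domain)


print(looks_like_email("^_^@mydomain"))  # True
print(looks_like_email("no-at-sign"))    # False
```

An address such as `^_^@mydomain` passes because the check makes no assumptions about which characters the local part may contain. Whether the mailbox exists is a question only an actual delivery attempt can answer.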
Re: Programers can not even figures (Score:2)
Re: (Score:3)
You're gonna have to let us know just how many "test" emails you receive in the next few days!
Re:Programers can not even figures (Score:4, Insightful)
Programmers who write database-aware programs that choke on the literal words, "null", "blank", or whose programs can't accept an apostrophe are simply incompetent or just plain stupid. There is absolutely no excuse for that kind of idiocy.
not sure I believe story (Score:2)
Remember the TREATY OF HUDAIBIYAH (Score:2)
All the names people gave themselves when we database programmers were weak is no longer enforceable once we became strong. Now we enter the name of the baby at birth in the hospital. If the name could not be entered, tough luck, pick a new name proud parents! Not born in a hospital? Hospital does not have computer? tough luck,
stupid swear filters (Score:2)
My last name trips up decency filters on websites. My wife tried to create an account on some website and it wouldn't accept our last name in the registration field, saying foul language wasn't allowed.
My last name.. Dike
Not just names (Score:2)
It happened to me in Spain. Foreigners get IDs that start with an X, while natives' IDs start with a 0. There was this system to get the payroll ant it wouldn't accept an ID if it didn't start with a 0. The IT zombies tried to convince me once and again that I was inserting my password incorrectly as the illiterate "sudaka" I obviously was. It wasn't till I used technical jargon and told them to log on using the VNC that they took me seriously and fixed it.
Re: Not just names (Score:2)
Before someone beats me to it: It's "payroll and..." not "payroll ant..."
Shit coders (Score:2)
Sorry, but if a last name of "Null" breaks your code, you're a shit coder.
The same for name fields- a 50 character limit should be the minimum. Database space is cheap, what exactly do people think they're saving by restricting a name field to 20 characters or so?
It pisses me off when a site insists that your last name HAS to be more than 2 characters, or that your first name can't be a single letter. Believe it or not, some people DO have names like that. If he was still alive someone like e. e. cummings [google.com] w
Could be worse... (Score:2)
Your name could be Cherry Chevapravatdumrong. She's one of the producers on Family Guy.
http://www.imdb.com/name/nm221... [imdb.com]
The fancy-pants name for this problem (Score:2)
The fancy-pants name for this is the semipredicate problem [wikipedia.org]
I don't recall how I stumbled upon that article; but it's one of my favorite "look at me, I can use a long word for it" things now.
They forgot about Wolfe+585 (Score:3)
osCommerce and its derivatives susceptible to this (Score:3)
I run two e-commerce stores based on osCommerce and had this exact issue with a customer whose last name was Null. There is a common function in osCommerce (tep_not_null) trying to see if the argument is empty. One of the things it looks for is the string "null". When I discovered this, I removed that part of the test (which never made sense to me.)
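The kind of check described there can be sketched in Python. Both functions are hypothetical analogues written for illustration and are not osCommerce's actual `tep_not_null` code.

```python
from typing import Optional


def not_null_buggy(value: Optional[str]) -> bool:
    """Hypothetical analogue of the check described above."""
    if value is None:
        return False
    stripped = value.strip()
    # The bug: the text "null" is treated as if no value had been supplied.
    return stripped != "" and stripped.lower() != "null"


def not_null_fixed(value: Optional[str]) -> bool:
    """Only a missing value or an empty/whitespace-only string counts as empty."""
    return value is not None and value.strip() != ""


print(not_null_buggy("Null"), not_null_fixed("Null"))  # False True
```

With the string comparison removed, a customer named Null is handled like any other, while a value that was never supplied or is blank is still rejected.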
Names that break computers? (Score:3)
Re: (Score:3)
Just move to Scunthorpe.
Re: Mysterious East (Score:2)
Re: (Score:2)
No Intercourse, PA?
Re: (Score:2)
A few years ago these people regularly had trouble signing on to certain US based services.
Quite surprising for a place where they shit in the restroom
Re: (Score:2)
Hope the registration of your plane isn't FPL because that's part of the syntax of an ICAO flight plan creation message.
Re: (Score:3)
Re:LoL (Score:4, Informative)
Byte sizes have varied a lot in the past and could conceivably vary in the future too (but it's unlikely). Even the definition of a byte as a concept has varied: most systems define a byte as the smallest addressable element, while some defined it as the character size. Word-addressed machines very seldom used "byte" to describe the addressable element size, but some had word-sized characters... It's a mess.
A more correct name is octet which by definition consists of 8 binary digits.
Re: (Score:3)
Yes, I learned machine language on computers that literally only had 256 bytes.
I wrote a new course to teach students on a machine with only 64 bytes of RAM (1k word ROM and 128 bytes EEPROM) . In 2008 or so. Such machines still exist in staggeringly huge numbers. See, for example the PIC12F675. Their bottom end model (the 10F200) has a staggering 16 bytes of RAM and 256 words of flash.
So I'm guessing you're either an ancient greybeard or did a machine language course on a very small microcontroller.
I did l
Re: (Score:2)
From my experience many companies do things on the cheap and then keep it around forever. Fortunately any robust ORM should make any name work with relatively little effort as long as the length is less than 255. Unfortunately many developers still write code like it's 1999.
Re: (Score:2)
Re: (Score:2)
It's unreasonable to expect the Japanese to change their systems because of a problem that only occurs for badly integrated foreigners.
Doesn't Japan actually require immigrants to adopt a Japanese name?
Re:I solved the problem with my long complicated n (Score:4, Insightful)
As long as your last name isn't a single letter. That catches my pseudonym fairly regularly.
Back when I worked in medical data, I encountered real people with single-character names. It happens for real names, too. For programmers, the rule is simple: Don't use names for anything except your application's convenience, and don't have any restrictions on them. Don't even require their existence.
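A minimal Python sketch of a record that follows that rule. The `Person` class and its field names are hypothetical: identity rests on an opaque id, and the name is a single optional free-form string with no length or character restrictions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Hypothetical record: identity comes from an opaque id, never from the name.
    person_id: int
    # One free-form display string; None means no name was given.
    full_name: Optional[str] = None

    def display_name(self) -> str:
        """Return the name exactly as entered, or a neutral placeholder."""
        return self.full_name if self.full_name else f"Person #{self.person_id}"


print(Person(1, "Null").display_name())  # Null
print(Person(2, "X").display_name())     # X
print(Person(3).display_name())          # Person #3
```

Because nothing in the application keys on the name, a single-character name, a surname of Null, or no name at all are all stored and displayed without special cases.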
Re: And if you're Irish... (Score:3)
That's the anglicised version.
The Gaelic original uses Ó.
Re: (Score:3)
Says the person named "Anonymous Coward".
Re: (Score:3)
Says the person named "Anonymous Coward".
Noel's [wikipedia.org] son, presumably. Posting incognito.