arethinn: abstract purple lines on black background (general (starthreads))
100 _1 Hooks, Bell.
245 10 We real cool : ǂb Black men and masculinity / ǂc Bell Hooks.


I find it interesting and somewhat irritating that cataloging practice apparently denies bell hooks her preferred capitalization of her name. I can understand why the authority file, which controls the form of the name found in the 100 field, perhaps needs to be standardized on the "usual" capitalization for data-entry-conformity reasons (indeed, the Wikipedia article on her capitalizes "Bell" in the URL). But the statement of responsibility (the part after "/ ǂc" in the 245 field) is supposed to be transcribed as it appears on the item, except that titles are dropped (only names as such are used). Her name is lowercased in the LC Cataloging-in-Publication block on the back of the title page, which is where the data for these fields usually gets initially slurped from. That probably means someone in the train (DLC ǂb eng ǂc DLC ǂd UKM ǂd BAKER ǂd BTCTA ǂd YDXCP ǂd OCLCQ ǂd DEBBG ǂd OCL ǂd OCLCQ ǂd RV8) changed it at some point.* Pfeh.

I wonder if the upcoming switch from AACR2 to RDA**, which is really big on "transcribe what you see" to the point that people transcribe TITLE PAGE STUFF IN ALL CAPS JUST THAT WAY MAKING IT RATHER HARD TO READ, will allow "bell hooks" in this instance.

I think I may change it in our local copy of the record anyway. If it was good enough for LC when providing the CIP block, it's good enough for me.

--

* Except for "eng", which is a language code, these are called "OCLC symbols" and identify various cataloging institutions.

** Sets of cataloging rules. Don't worry about it.

Date: Mar. 1st, 2012 12:00 am (UTC) From: [personal profile] bubbleblower
This got me to wondering about things like whether searches that users might do are case-sensitive and thus might fail to find the name if said user isn't aware of the non-standard capitalization. Or is it customary to keep lists of common misspellings on file to catch errant searches?

I was also reminded of http://xkcd.com/327/ and what, if any, limits there might be on one's right to choose one's own name.

Date: Mar. 2nd, 2012 04:26 pm (UTC) From: [personal profile] logomancer
While Google's algorithms are Sooper Top Sekrit, my educated guess is that they use a modified version of Bayesian filtering with a near-neighbor finding algorithm to detect simple letter transpositions. When someone types a word with few results into the search bar, Google checks whether there's a close word with more results and asks if that's what the person meant. If they click on the suggested link, that's a yes, and the probability of the two being linked together goes up. If enough of this happens, then the substitution happens automatically. Of course, if they click on a different link, that's a no, and the filter takes that into account. In the end, it's all statistical analysis and a massive ton of storage, which is how machine learning has progressed for years now.
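To make that guess concrete, here is a toy sketch in Python. It is emphatically not Google's method: `ToySuggester`, its thresholds, and the sample counts are all invented for illustration. The standard library's `difflib.get_close_matches` stands in for the near-neighbor step, and a smoothed acceptance rate stands in for the Bayesian part.

```python
from collections import defaultdict
from difflib import get_close_matches


class ToySuggester:
    """Toy 'did you mean' helper; every name and number here is made up."""

    def __init__(self, result_counts, auto_threshold=0.8, min_votes=5):
        self.result_counts = result_counts  # word -> number of search results
        self.yes = defaultdict(int)         # (typed, suggested) -> suggestion clicked
        self.no = defaultdict(int)          # (typed, suggested) -> something else clicked
        self.auto_threshold = auto_threshold
        self.min_votes = min_votes

    def candidate(self, word):
        """Return the most popular close word with more results than `word`, or None."""
        own = self.result_counts.get(word, 0)
        close = get_close_matches(word, list(self.result_counts), n=5, cutoff=0.75)
        better = [w for w in close if w != word and self.result_counts[w] > own]
        return max(better, key=self.result_counts.get) if better else None

    def record(self, word, suggestion, accepted):
        """Log one piece of click feedback for a (typed, suggested) pair."""
        key = (word, suggestion)
        if accepted:
            self.yes[key] += 1
        else:
            self.no[key] += 1

    def probability(self, word, suggestion):
        """Acceptance rate with add-one smoothing, so no feedback gives 0.5."""
        key = (word, suggestion)
        return (self.yes[key] + 1) / (self.yes[key] + self.no[key] + 2)

    def should_auto_substitute(self, word, suggestion):
        """Substitute automatically only after enough mostly-positive feedback."""
        key = (word, suggestion)
        votes = self.yes[key] + self.no[key]
        return (votes >= self.min_votes
                and self.probability(word, suggestion) >= self.auto_threshold)


suggester = ToySuggester({"catalog": 1000, "cataolg": 2})
guess = suggester.candidate("cataolg")
for _ in range(5):
    suggester.record("cataolg", guess, accepted=True)
```

Each "no" click drags the acceptance rate back down, which matches the idea that the filter takes rejections into account; a real system would presumably work from far more signals than two counters.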

Naming systems in computer databases are pretty damned inflexible, really, and not just for people who case their name differently; a lot of computer systems assume you have a Western-style name, with a given name, a middle name, and a surname, and maybe a title and a suffix. Spanish/Portuguese-style names, for instance, with more than one given name and more than one surname, are not properly reflected in most database structures. Arabic names are similarly problematic, with one given name and one surname, but multiple patronymics. As database structures tend to endure unless there's something seriously wrong with them, I anticipate that this is a problem that will last for quite a while, sadly.
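For what it's worth, a less rigid structure isn't hard to sketch. The Python below is only an illustration under my own assumptions: `PersonName` and its fields are hypothetical, not any real system's schema, and the third sample name is invented. The idea is to store the name exactly as the person writes it, keep the parts as lists rather than fixed slots, and fold case only when comparing searches.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonName:
    """Hypothetical name record: display form kept verbatim, parts kept as lists."""
    display: str                                          # as the person writes it, case included
    given: List[str] = field(default_factory=list)        # zero or more given names
    patronymics: List[str] = field(default_factory=list)  # zero or more patronymics
    surnames: List[str] = field(default_factory=list)     # zero or more surnames

    def search_key(self) -> str:
        """Case-folded, whitespace-collapsed form used only for matching."""
        return " ".join(self.display.casefold().split())

    def matches(self, query: str) -> bool:
        """True if the query equals the name once case and spacing are ignored."""
        return " ".join(query.casefold().split()) == self.search_key()


hooks = PersonName(display="bell hooks", given=["bell"], surnames=["hooks"])
marquez = PersonName(
    display="Gabriel García Márquez",
    given=["Gabriel"],
    surnames=["García", "Márquez"],
)
# Invented name, purely to show several patronymics in one record.
invented = PersonName(
    display="Omar ibn Khalid ibn Hassan al-Masri",
    given=["Omar"],
    patronymics=["ibn Khalid", "ibn Hassan"],
    surnames=["al-Masri"],
)
```

A search for "Bell Hooks" still finds the record through `matches`, while `display` keeps the lowercase form for anything shown to a reader.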
