Databases Oracle Python Software Wikipedia

Python-LMDB In a High-Performance Environment (98 comments)

lkcl writes: In an open letter to the core developers behind OpenLDAP (Howard Chu) and Python-LMDB (David Wilson) is a story of a successful creation of a high-performance task scheduling engine written (perplexingly) in Python. With only partial optimization allowing tasks to be executed in parallel at a phenomenal rate of 240,000 per second, the choice to use Python-LMDB for the per-task database store based on its benchmarks, as well as its well-researched design criteria, turned out to be the right decision. Part of the success was also due to earlier architectural advice gratefully received here on Slashdot. What is puzzling, though, is that LMDB on Wikipedia is being constantly deleted, despite its "notability" by way of being used in a seriously-long list of prominent software libre projects, which has been, in part, motivated by the Oracle-driven BerkeleyDB license change. It would appear that the original complaint about notability came from an Oracle employee as well.
This discussion has been archived. No new comments can be posted.

  • by i kan reed ( 749298 ) on Friday October 17, 2014 @12:15PM (#48170191) Homepage Journal

    At some point there will be an article on Wikipedia, that only meets Wikipedia's notability requirements due to media spillover complaining about the notability requirements.

    • At some point there will be an article on Wikipedia, that only meets Wikipedia's notability requirements due to media spillover complaining about the notability requirements.

      yaaay! :) works for me. wasn't there a journalist who published a blog and used that as the only notable reference to create a fake article? :)

      • >wasn't there a journalist who published a blog and used that as the only notable reference to create a fake article? :)

        I can recommend you a fascinating pair of books: The Secret History of the War on Cancer by Devra Davis and The Merchants of Doubt by Naomi Oreskes. There is a very long history of circular self-reference among dishonest journalists and scientists; for example Fred Singer would write a letter to the Wall Street Journal, then write an Op-Ed piece for a smaller outfit using the Wall St

    • by Alsee ( 515537 ) on Friday October 17, 2014 @03:48PM (#48172205) Homepage

      I was involved in an example of this recently. TheFederalist.com is a one-year-old rightwing website. They ran an attack piece on Neil deGrasse Tyson. It was picked up by the rightwing blogosphere, but was totally non-newsworthy (as established by the lack of news coverage). Someone tried to insert it into Wikipedia's biographical page on Neil deGrasse Tyson. That edit was promptly reverted because Wikipedia has a policy of being extremely cautious about adding negative material to the Biography of Living Persons. A blogosphere rant against someone doesn't qualify. So then TheFederalist.com writer started screaming CENSORSHIP and equating Wikipedia editors to religious fundamentalist terrorists for not writing his hit-job into Tyson's biography. *THIS* picked up some minor coverage for the story from other sources.

      At this point someone noticed that we had a tiny article page on TheFederalist.com, and the only sourcing for that article was TheFederalist itself and a blog page from MediaMatters. The TheFederalist page was nominated for deletion. A massive effort was made by many people trying to find any sources talking about TheFederalist.com, searching for any sources we could use to fix the article. The search turned up squat. Then TheFederalist.com wrote about Wikipedia nominating their article for deletion, and *THAT* got picked up by a few sources. And *THOSE* stories gave us enough information about TheFederalist.com in order to write an article on it.

      So yeah..... it was painfully circular. ~~~~

      -

  • If Wikipedia was a person I would smack it upside the head for shit like this. There is absolutely no reason not to have an article on LMDB, and deleting a perfectly good article for no reason is evidence of a mental disorder. It's not like they have to spend an extra penny for a piece of paper to hold the article, possibly making the book too thick. Wake up.

    Yeah, I'm FAR from a Wikipedia hater, but when it pulls shit like this it reveals its stupidity.

    • Wikipedia has rules. While those rules exist for good reasons, by nature of being rules they are most easily navigated by a bureaucratically minded, officious mindset.

      People have this false mindset where wikipedia, by virtue of their "anyone can edit" policy is an infinite bastion of free expression. When really, it's just a whole lot of people disagreeing and squabbling and working and editing to make and upkeep an encyclopedia.

      • Re: (Score:2, Funny)

        by Anonymous Coward

        [citation needed]

        • by Alsee ( 515537 )

          <ref>https://en.wikipedia.org/wiki/Wikipedia:List_of_policies_and_guidelines</ref>

          -

      • If the rules legitimately preclude a page on LMDB, they certainly should preclude individual pages for MySQL backends like Falcon, Aria, and Toku, shouldn't they? And yet there they are.

      • Wikipedia has WP:rules. While those WP:rules exist for good WP:reasons, by nature of being WP:rules they are most easily[opinion] navigated by WP:bureaucratically minded, officious mindset.

        FTFY. And [citation needed]

      • Some of Wikipedia's rules are ass-backwards asinine. Such as Avoid Trivia [wikipedia.org]

        One man's trivia is another man's noise.

        Oh I see, so only if it is _popular_ does the "truthiness" count.

        Fuck that. I want an _inclusive_ dictionary / encyclopedia / reference, not an _exclusive_ based on some "arbitrary" rules simply because something is not popular. I am there in the first place to _learn_ about things I don't know about ! Not because some asshat decided "not enough people care about this topic."

        It is not like a

    • If Wikipedia was a person I would smack it upside the head for shit like this. There is absolutely no reason not to have an article on LMDB, and deleting a perfectly good article for no reason is evidence of a mental disorder. It's not like they have to spend an extra penny for a piece of paper to hold the article, possibly making the book too thick. Wake up.

      Yeah, I'm FAR from a Wikipedia hater, but when it pulls shit like this it reveals its stupidity.

      Wikipedia has a pretty standard bar for articles it should curate (which is decidedly not free) and that is, does the subject have any sort of peer-reviewed literature available (and source code comments, howtos, etc don't count)? This goes directly to the "no original research" policy, which basically asserts that Wikipedia editors (including the one that created the page) should not be writing the article based on their original work, since Wikipedia is not the place for peer review to happen. Long stor

    • If Wikipedia was a person I would smack it upside the head for shit like this.

      If Wikipedia were a person, you could just edit his face.

    • If Wikipedia was a person I would smack it upside the head for shit like this. There is absolutely no reason not to have an article on LMDB, and deleting a perfectly good article for no reason is evidence of a mental disorder. It's not like they have to spend an extra penny for a piece of paper to hold the article, possibly making the book too thick. Wake up.

      Speaking only from personal experience, there seems to be a disconnect between what people actually derive value from and the rules (and perhaps the original intent) of Wikipedia.

      We seem to be stuck in a situation where lack of enforcement itself is supporting quite a bit of value and interest in the site... A situation ripe for leverage by personal whims and selfish persuasion.

      I don't think there are any easy answers yet the rampant deletions are particularly annoying and unhelpful to me as a user of Wikipedia.

  • Deletionists (Score:4, Insightful)

    by HeckRuler ( 1369601 ) on Friday October 17, 2014 @12:49PM (#48170535)

    I never understood the deletionist mentality on Wikipedia. But there's a whole group of people that want to remove information from the public view.

    I semi-understand the idea that this "very important" encyclopedia is "too important" for such things as a page for each character from a game I never played. And somehow by culling these frivolous things they somehow make wikipedia higher quality on the whole? Maybe? Kinda? I don't think these people understand how search works.

    There are the obvious shills and PR people that want to sweep things under the rug. These are nefarious and to be found and fought.

    There are fools who think it's expensive to store this information. As if an edit-war to remove it was cheaper.

    I understand people don't want articles that are just free advertising. But I doubt anyone is going to delete the page for Monsanto.

    But fundamentally, I just don't get their worldview.

    • by Anonymous Coward

      My theory is that they have a craving to exercise power over others, and Wikipedia deletion is as close as they can get to that goal. If they were intelligent enough and introspective enough to figure out their own motives, they'd be cops, teachers, politicians or drill sergeants. But since they either aren't smart enough or aren't self-aware enough to see this, and they get a righteousness buzz whenever they delete somebody else's work, they'll look for clever-sounding rationales to justify their behavio

    • by Alsee ( 515537 )

      The "worldview" is that Wikipedia is supposed to be an Encyclopedia. Wikipedia is the Encyclopedia That Anyone Can Edit, not a public blog-space. The only thing that prevents Wikipedia from becoming a scribble-board are the Wikipedia Policies, and editor dedication to those policies. If you throw out Wikipedia content-verifiability policies then it would start looking a lot less like an Encyclopedia.

      I don't think these people understand how search works.

      How search works: If you type a search term into Google you'll get random writings about the topic, no matter

      • Sure sure, verifiable is important. But even with something to verify the information on the page, you still get those deletionists that will claim notability, and fast-track the page for deletion.

        I don't give a rats fucking ass if you don't think that rat-asses are notable or not. If there are citable facts on the page, LEAVE IT BE. And let me make this clear. In your VERY NEXT BREATH you went from "it'll be a scribble-board without verifiability" to "no matter how trivial".

        Who the fuck cares who trivial i

        • by Alsee ( 515537 )

          Sure sure, verifiable is important. But even with something to verify the information on the page, you still get those deletionists that will claim notability, and fast-track the page for deletion.

          If you were paying attention, I explained exactly how to prevent an article from being deleted. Include a couple of independent Reliable Sources talking about the topic, saying things that can be used to build an article. Once you have that then primary sources can help expand the article if used properly, but we have rules against articles built solely with primary sources because primary-source-only articles raise a shitton of problems.

          But no, you're high and mighty and you just don't give a fuck about how many pokemon there are.

          What the hell are you ranting about? Not only does Wikipedia have an a

  • Why don't they publish the algorithm on wikipedia instead? Putting a non-popular library on wikipedia seems a bit extreme. It may well include all github projects including mine...

  • by MSG ( 12810 )

    I'll start with: LMDB is awesome, and I am super SUPER impressed with OpenLDAP's benchmarks over the last several years. I do not question LMDB's worth.

    I'm just not really sure that this letter is evidence thereof. The author got poor performance from a SQL database with no indexing, which degraded as the number of records grew? You don't say! A database that has to do a full scan for reads performs poorly?

    Surprise about load average seems equally naive. If you fork a bunch of processes that are doing
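
For readers who want to see the full-scan point concretely, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for whatever SQL database was actually benchmarked (an assumption; the letter's setup is not reproduced here), and the `tasks` table, its columns, and the `idx_tasks_name` index are hypothetical names chosen for illustration. It asks SQLite for its query plan before and after an index is created.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN, joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical per-task table; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (task_id INTEGER, name TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    ((i, f"task-{i}", "x" * 32) for i in range(10_000)),
)

lookup = "SELECT payload FROM tasks WHERE name = ?"

# Without an index, the only way to find a row by name is to read every row.
plan_without_index = query_plan(conn, lookup, ("task-9999",))

# With an index on the lookup column, SQLite can search instead of scanning.
conn.execute("CREATE INDEX idx_tasks_name ON tasks (name)")
plan_with_index = query_plan(conn, lookup, ("task-9999",))

print("no index:  ", plan_without_index)
print("with index:", plan_with_index)
```

The exact plan wording varies between SQLite versions, but the first plan should report a scan of the whole table and the second a search that names the index. The cost of the scan grows with the row count, which matches the degradation described above.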

    • The author got poor performance from a SQL database with no indexing, which degraded as the number of records grew? You don't say! A database that has to do a full scan for reads performs poorly?

      yes. it was that i had to do that analysis in a formal repeatable independent way, which i had never done before, and i was very surprised at the poor results. i was at least expecting a *consistent* and reliable rate of... well, i don't know: i was kinda expecting PostgreSQL to be top of the list and i was kin

      • by MSG ( 12810 )

        so it's not the *actual* loadavg that is relevant but that the *relative* loadavg before and after that one simple change was so dramatically shifted from "completely unusable and in no way deployable in a live production environment" to a "this might actually fly, jim" level.

        That's not loadavg, that's IO latency. You should probably be using iostat to get useful numbers.

        loadavg is completely useless when discussing system performance, it is in no way related.

        • by lkcl ( 517947 )

          That's not loadavg, that's IO latency. You should probably be using iostat to get useful numbers.

          oo, thank you very much for that tip, i'll try to pass it on and will definitely remember it for the next projects i work on. thank you.

          • by MSG ( 12810 )

            If you haven't used iostat before: Run "iostat -x 2" to get a report of block device utilization every two seconds. Ignore the first report; it details utilization since system boot. All subsequent reports will be for the period after the previous report.

            If you can repeat your earlier tests, and want to see if there's actually a Linux bug, compare numbers when the program opens DBs before forking, and when it opens them after. If you're seeing bad latency in the former case, but similar B/s, that might
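
As a rough illustration of why the first iostat report is set aside: the kernel's block device counters are cumulative since boot, so a useful figure only appears once two samples are subtracted. The sketch below does that subtraction in Python for text in the Linux /proc/diskstats layout. The helper names and the two sample lines are made up for illustration, and this is not iostat's actual implementation.

```python
def parse_diskstats(text):
    """Map device name -> (I/Os completed, milliseconds spent doing I/O).

    Expects lines in the Linux /proc/diskstats layout:
    major minor name reads ... writes ... in_flight io_ms weighted_io_ms
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        name = fields[2]
        reads_completed = int(fields[3])
        writes_completed = int(fields[7])
        io_ms = int(fields[12])  # cumulative time the device was busy
        stats[name] = (reads_completed + writes_completed, io_ms)
    return stats


def interval_report(before, after, interval_s):
    """Turn two cumulative snapshots into per-interval IOPS and utilization."""
    report = {}
    for name, (ios_now, busy_now) in after.items():
        if name not in before:
            continue
        ios_then, busy_then = before[name]
        report[name] = {
            "iops": (ios_now - ios_then) / interval_s,
            "util_pct": 100.0 * (busy_now - busy_then) / (interval_s * 1000.0),
        }
    return report


# Two made-up snapshots taken two seconds apart (illustrative numbers only).
SAMPLE_T0 = "   8       0 sda 1000 0 8000 500 2000 0 16000 1500 0 1800 2000"
SAMPLE_T2 = "   8       0 sda 1100 0 8800 550 2300 0 18400 1700 0 2800 2250"

report = interval_report(parse_diskstats(SAMPLE_T0), parse_diskstats(SAMPLE_T2), 2.0)
print(report)
```

On a Linux machine, the two sample strings could be replaced by reading /proc/diskstats twice with a two-second sleep in between. A single snapshot on its own only gives totals since boot, which is the situation the first iostat report is in.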

    • no NSA project here, unless Digital Pine is a subsidiary? small world, isn't it?
  • ... only hypothetically, via Pine Digital's business which prominently displays in wikileaks: https://www.wikileaks.org/spyf... [wikileaks.org]
