Women Get Pull Requests Accepted More (Except When You Know They're Women) (peerj.com) 293

An anonymous reader writes: In the largest study of gender bias [in programming] to date, researchers found that women tend to have their pull requests accepted at a higher rate than men, across a variety of programming languages. This, despite the finding that their pull requests are larger and less likely to serve an immediate project need. At the same time, when the gender of the women is identifiable (as opposed to hidden), their pull requests are accepted less often than men's.

  • Just a thought... (Score:4, Insightful)

    by ClickOnThis ( 137803 ) on Wednesday February 10, 2016 @04:04PM (#51481399) Journal

    Maybe women ask for pull-requests more nicely?

    • by zeoslap ( 190553 )

      But only when you don't know they are women...

    • There's no need for comparative statistics for men vs women, which leave you trying to control for all sorts of nebulous factors like how nicely they make requests, or how the genders might code differently.

      All you have to do is take a bunch of coders (men or women, doesn't matter), and have them submit a bunch of code online using a male persona, using a female persona, and anonymously (or at least gender-neutral). Then compare acceptance rate for each individual. That neatly eliminates all other fact
  • I am a cisgender hermaphrodite who identifies as male. Nobody pulls me!
  • by Anonymous Coward on Wednesday February 10, 2016 @04:18PM (#51481521)

    What is a pull request? Is it a good or bad thing?

    • by Verdatum ( 1257828 ) on Wednesday February 10, 2016 @04:28PM (#51481637)
      It's a term related to git, the tool a lot of us use to manage our source code and revision history. A pull request is when you finish a task and send your code changes up to the authorities of the project. When a pull request is approved, it means your code changes have been applied to the project.
    • Even those of us who are career programmers aren't necessarily git users, and I'm pretty sure "pull request" is a git-ism. I think it's kind of like a commit (or maybe branch merge) in more traditional version-control systems, except under the control of the project manager instead of the person submitting the code.

      • by bluefoxlucid ( 723572 ) on Wednesday February 10, 2016 @04:38PM (#51481713) Journal

        For distributed version control systems like git, mercurial, bazaar, bitkeeper, and darcs, there's no central repository. You can have an authoritative source, which is just like every other source aside from a fancy name tag. A pull request is a request to pull (and merge) a branch from another repository.

      • by tnk1 ( 899206 ) on Wednesday February 10, 2016 @04:46PM (#51481785)

        A pull request is definitely a "git-ism". It's a request to other coders to update their own local git codebase to incorporate the changes that the requester has made. So it is like a "request to commit" to some degree, but allows for decentralization.

        So, you can accept a pull request to your own personal branch/fork and it doesn't have to go on the main branch. This allows two (or more) coders to sync their branches with each other, without necessarily impacting the main branch. Then at some point, when there is full agreement among the collaborators about what they want to submit to main, the merged branch with all their work (or any one of the up-to-date branches) has a PR generated for it, and the request is made to update the main. (Or perhaps their branch just becomes a fork of the original code and now that branch is "main").

        Obviously, if the PR is accepted to the main, there could be rules about who can do it and/or under what circumstances. There may be a main branch committer, or there could just be rules to allow anyone to commit, as long as they aren't the author and that they have verified the changes meet the appropriate code review and testing requirements. There's no actual difference in the mechanical aspects of it; the main branch works just like any other branch aside from the designation of that branch as the "authoritative" code base for the builds and release candidates.
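The workflow described above can be sketched as a toy model. This is not git and not any hosting service's API: `Repo`, `PullRequest`, and `review` are hypothetical names invented for illustration, history is reduced to a set of commit IDs, and the "reviewer must not be the author" rule is just one of the possible policies the comment mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """A repository reduced to a name and a set of commit IDs."""
    name: str
    commits: set = field(default_factory=set)

    def fork(self, name):
        # A fork starts as a full copy of the history it was forked from.
        return Repo(name, set(self.commits))


@dataclass
class PullRequest:
    """A request that `target` pull the commits it lacks from `source`."""
    source: Repo
    target: Repo
    author: str

    def new_commits(self):
        return self.source.commits - self.target.commits


def review(pr, reviewer, approve):
    """Merge the request if it is approved by someone other than its author."""
    if reviewer == pr.author or not approve:
        return False
    pr.target.commits |= pr.new_commits()
    return True


main = Repo("main", {"c1", "c2"})
alice = main.fork("alice")
bob = main.fork("bob")

alice.commits.add("a1")  # Alice works in her own fork.
# Bob pulls Alice's work into his fork; main is untouched.
review(PullRequest(alice, bob, author="alice"), reviewer="bob", approve=True)
bob.commits.add("b1")

# Later, the combined work is offered to main in a single request.
pr = PullRequest(bob, main, author="bob")
self_merge = review(pr, reviewer="bob", approve=True)    # refused: own request
peer_merge = review(pr, reviewer="carol", approve=True)  # merged by a peer
```

The point of the sketch is that `main` is an ordinary `Repo`; only the review policy applied to requests targeting it makes it "authoritative".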

  • Self-Selection? (Score:5, Insightful)

    by Diss Champ ( 934796 ) on Wednesday February 10, 2016 @04:21PM (#51481563)

    Is it possible that those women who don't feel it necessary to point out their gender in situations where gender doesn't matter tend to also be those more likely to communicate well?

    Is it possible that those women who make it a point to draw attention to their gender in situations where there is no reason to bring up gender at all, are also more likely to be less convincing regarding the usefulness of their work?

    • Re:Self-Selection? (Score:5, Insightful)

      by Pseudonymous Powers ( 4097097 ) on Wednesday February 10, 2016 @04:29PM (#51481639)

      Interesting point. Also worth asking is:

      Is it possible that those developers who don't feel it necessary to point out their favorite college sports team in situations where their favorite college sports team doesn't matter tend to also be those more likely to contribute worthwhile changes? Is it possible that those developers who make it a point to draw attention to their favorite college sports team in situations where there is no reason to bring up their favorite college sports team at all, are also more likely to be less convincing regarding the usefulness of their work?

      • Re:Self-Selection? (Score:5, Insightful)

        by lgw ( 121541 ) on Wednesday February 10, 2016 @05:21PM (#51482133) Journal

        possible that those developers who don't feel it necessary to point out their favorite college sports team in situations where their favorite college sports team doesn't matter tend to also be those more likely to contribute worthwhile changes?

        The double-negative makes it hard to parse, but I think I agree: "people who point out unimportant distractions about themselves have lower-quality submissions". Seems perfectly reasonable to me.
         

        • by zarr ( 724629 )
          Using your real name when submitting a PR is not "pointing out an unimportant distraction about yourself".
          • by lgw ( 121541 )

            Never use your real name on the internet. No good can come of that.

            • Never use your real name on the internet. No good can come of that.

              I use my real name when doing things on the internet in a professional context. That includes github. If I'm not identifiable as me, then how will anyone know who I am, or how to tie the github account to stuff I do with my real name, such as my company and academic work? This is the case with very many professionals, such as most of the prominent linux kernel developers.

              • by lgw ( 121541 )

                I take your point for the few people working professionally with GitHub instead of the normal case for software devs.

                In my case, my professional name isn't my legal name - the latter isn't anywhere on the internet. But to your point, my professional name does indicate my sex, and if I were trying to make a living with open source it would show up in GitHub.

                • In my case, my professional name isn't my legal name -

                  That's quite an unusual case. What made you go that way?

                  But to your point, my professional name does indicate my sex, and if I were trying to make a living with open source it would show up in GitHub.

                  Same. I don't make a living with open source, but I have some open libraries related to work I do on github. They've proven moderately popular within the application domain. I mostly made money in that area consulting/contracting and it helps to be visible.

                  • That's quite an unusual case. What made you go that way?

                    People don't take you as seriously when you sign your emails "Talula Does The Hula From Hawaii".

                  • by lgw ( 121541 )

                    I value my privacy. Always have. Seemed like an obvious way to go. But even my professional name only appears on the internet in my linkedin, and in the minutes of a standards committee I worked with. I can't imagine using my name for any forum, or my hobby github work, or whatever.

      • by ceoyoyo ( 59147 )

        I certainly discriminate, not altogether unconsciously, against people who consistently bring up their favourite sports team in non-sports related situations.

    • by mwvdlee ( 775178 )

      What do you mean by "pointing out their gender"? My avatar on Github is just a portrait photo of me, looking like the guy I am.
      If a woman is using a portrait of herself as her avatar, does that count as "pointing out their gender" or is it simply a portrait photo?

      • That's a good question. I don't use Github, so didn't know folks there tended to use actual photos of themselves as a matter of course. Most folks in the environment I'm in have avatars that are not portraits of them- if they bother with one at all.

        I suppose the following additional analysis could be done:
        1. Do men who look like women tend to statistically match women or men?
        2. Do women who look like men tend to statistically match women or men?

        Also perhaps interesting- do men whose gender are not made appa

        • by blueg3 ( 192743 )

          Also perhaps interesting- do men whose gender are not made apparent statistically do better than those who do?

          You know the study itself is a pretty short read, right?

          Anyway, yes. Everyone, male or female, with a "gender-neutral" GitHub profile had pull requests accepted at a higher rate than everyone with a "gendered" profile. The difference between gendered and gender-neutral profiles was larger than the difference between genders. Note that all of that is for "outsiders" -- insiders have a higher acceptance rate overall, with seemingly little difference between (male, female) x (gendered, gender-neutral).

          • Re:Self-Selection? (Score:5, Interesting)

            by ceoyoyo ( 59147 ) on Wednesday February 10, 2016 @06:48PM (#51483077)

            First impression: somebody needs to learn about statistics that have more than one predictor variable.

            Second impression: despite the lack of appropriate analysis, the differences in figure 5 are big enough to be reasonably clear. It looks like there is discrimination against anybody who has a gendered profile (maybe maintainers don't like pictures?). This discrimination might be slightly greater against outside women, and is fairly likely greater against inside men.

            Third impression: the paper and the Slashdot summary have a strong gender bias; they mention only the small and borderline significant anti-female bias while ignoring the more significant anti-male bias and also the much larger anti-(either) gender identifiable bias.
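To make the "more than one predictor variable" point concrete, here is a minimal sketch that separates the overall effect of a gendered profile from the gender-specific part. The rates are made-up placeholders, not figures from the paper, and `drop` is a hypothetical helper name.

```python
# Hypothetical acceptance rates (NOT taken from the paper), keyed by
# (gender, profile type), for outsiders only.
rates = {
    ("women", "neutral"): 0.80,
    ("women", "gendered"): 0.70,
    ("men", "neutral"): 0.78,
    ("men", "gendered"): 0.72,
}


def drop(gender):
    """Fall in acceptance rate when the profile reveals gender."""
    return rates[(gender, "neutral")] - rates[(gender, "gendered")]


# Effect of having a gendered profile, averaged over both genders.
profile_effect = (drop("women") + drop("men")) / 2

# Interaction: how much larger the drop is for women than for men.
interaction = drop("women") - drop("men")

print(f"average drop for gendered profiles: {profile_effect:.3f}")
print(f"extra drop for women (interaction): {interaction:.3f}")
```

Reporting only `drop("women")` would fold the shared profile effect into what looks like a gender effect; the interaction term is the part that is specific to one gender.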

      • by blueg3 ( 192743 )

        It doesn't appear that the study considered "pointing out their gender" at all.

        Rather, they tried to determine whether the gender of a GitHub profile was readily apparent.

        Per the description of their methodology, if you use a profile image (rather than an identicon), you are automatically considered "gender is readily apparent". If that test fails, they look at the confidence level output by a gender-guessing bot of some kind. If that fails, they have a method for estimating the confidence level of a panel

    • Re: (Score:2, Insightful)

      by Xtifr ( 1323 )

      By "feel it necessary to point out their gender", do you mean going back in time and forcing their parents to give them a gender-neutral name like "Chris" instead of an obviously gendered name like "Maria"? Because I don't quite know how to tell you this, but time travel hasn't actually been invented yet... :D

      • Because I don't quite know how to tell you this, but time travel hasn't actually been invented yet...

        Time travel was invented next year.

    • This is a really obnoxious post.

      When a man (like me) has a GitHub account under his real name he's just a man with a GitHub account. Totally neutral. When a woman uses her name she's "making a point to draw attention to her gender".

      That's a colossal case of double standards that you and everyone who modded you up is guilty of.

      • I don't think the OP was talking about people who use their real names. Taking Slashdot as an example, I think that examples like girlintraining [slashdot.org] or Gaygirlie [slashdot.org] are what was being referred to, the latter of which also decided to use her username to reveal her sexuality in addition to her gender. People who use their real names aren't doing so necessarily to point out their gender, they're just using their real name. People who use an anonymous username that reveals their gender apparently think that everyon

        • I don't think the OP was talking about people who use their real names.

          And that's the problem. Huge numbers of people use their real names on github.

          Taking Slashdot as an example, I think that examples like girlintraining or Gaygirlie are what was being referred to,

          But not JustAnotherOldGuy or King Neckbeard.

          People who use their real names aren't doing so necessarily to point out their gender, they're just using their real name. People who use an anonymous username that reveals their gender apparently think

  • by Punko ( 784684 ) on Wednesday February 10, 2016 @04:21PM (#51481565)
    "pull requests" heh.

    Posted intentionally to lampoon typical responses.

    I am not surprised that requests are not followed up on when a female calls for them, nor am I surprised that their responses are more often responded to when the gender is hidden/neutral. What I am surprised by is that female pull requests are "larger and less likely to serve an immediate project need". Does this mean that female developers are concentrating on "big picture features" more often?
    • Re: (Score:3, Insightful)

      It means their code is less-important and so is not scrutinized as hard.
      • by AmiMoJo ( 196126 )

        No, it means that code is less likely to be small bug fixes or reactions to emerging issues, and more likely to be things like new features or architectural improvements.

      • How the fuck is leaping to a conclusion for which there isn't a shred of evidence considered "+5 insightful"?

    • by Kjella ( 173770 )

      I am not surprised that requests are not followed up on when a female calls for them, nor am I surprised that their responses are more often responded to when the gender is hidden/neutral. What I am surprised by is that female pull requests are "larger and less likely to serve an immediate project need". Does this mean that female developers are concentrating on "big picture features" more often?

      Would that be so astonishing? We come from a hunter-gatherer society where those out hunting had to think on their feet and seize the opportunities where they presented themselves. Gathering is a lot more about planning and organization, those berries won't run away but you have to harvest when they're ripe. And the women were also taking care of the children, sick and elderly for the long term survival and passing on knowledge of the tribe. We've had many thousands years of selection pressure to that effec

  • RTFA ... (Score:4, Funny)

    by Obfuscant ( 592200 ) on Wednesday February 10, 2016 @04:27PM (#51481625)
    Read TFA, at least through the "author contribution" section.

    Clearly, Clarissa didn't contribute anything, and Chris may or may not have contributed anything significant, it's hard to tell.

  • by avandesande ( 143899 ) on Wednesday February 10, 2016 @04:29PM (#51481641) Journal
    After reading the article, it appears that women lead pull acceptance in every case except for one edge case, and not by very much (it's like 64% vs 63%). Nothing interesting at all here.
    • by alexhs ( 877055 ) on Wednesday February 10, 2016 @06:48PM (#51483079) Homepage Journal

      To be fair, Slashdot's summary is not worse than the paper's summary.

      There's a long list of issues with their methodology, and they make a fair assessment of these in the "Threats" part, which BTW should be discussed in the article, and not in the appendices.

      As a whole, this paper reeks of "We wanted to show how / how much women were discriminated against in Open Source. Our findings showed the opposite, so we kept making up criteria until one would exhibit (barely) the bias we wanted to denounce."

      Of course when you're doing that, you're just begging to fall for this [xkcd.com].

      Non-exhaustive list of other issues I noticed:
      - Weighting issues: for example, how many commits from outsiders vs insiders. Given that, overall, women get better acceptance, I can conclude that insiders commit more than outsiders (in their dataset)
      - Missing stats (for example, we get gendered stats on whether a pull request is linked to an issue, but no insider / outsider distinction)
      - Plain old lies in the summary ("when a woman’s gender is identifiable, they are rejected more often" vs "Women have lower acceptance rates as outsiders when they are identifiable as women.")
      - Failure to mention that the error bars are for the strict dataset. I suppose this is standard practice, but the dataset error bars are probably swamped by the non-representativity of the dataset in the first place, and the methodology shortcomings, which means that they're misleading (nobody cares about their dataset). They don't make any effort to evaluate these errors (obviously that would be the hard part), and leave us with some hand-waving like "we are somewhat confident that robots are not substantially influencing the results".
      - Graphs that start at 60% to exaggerate differences (without using broken axis)
      - Using "theory" for "hypothesis"

  • by Verdatum ( 1257828 ) on Wednesday February 10, 2016 @04:31PM (#51481653)
    I honestly don't mind submissions about gender issues on /. But I do have a problem with posting articles that have not yet been peer reviewed. It is at least good of the link to make that perfectly clear.
    • In addition, it (at least the summary) states that there is gender bias even when the gender is unknown. It seems to me that if gender is unknown and unassumed there can't be bias, but there can be patterns of behavior based on other criteria that happen to correlate with gender but are not driven by gender bias.
  • Baloney Charts (Score:5, Interesting)

    by avandesande ( 143899 ) on Wednesday February 10, 2016 @04:39PM (#51481721) Journal
    Charts that show a narrowed percentage range (e.g. 60% to 80%) instead of the full range (0% to 100%) exaggerate the differences between amounts on the chart.
    • by blueg3 ( 192743 )

      Chart on page 10 is completely acceptable. It contains a lot of data, all of which is constrained to the 60-90% range, the range is clear, and the chart isn't really deceiving.

      Page 13 similarly has a lot of data and doesn't really deceive. All extending the bars down to 0% and up to 100% would do is make it harder to read. However, it would work better as a table.

      Chart on page 15 is a standard example of data that doesn't need a bar chart. Even with the narrowed range, most differences are difficult to see.

      • Yes, it is technically correct, but misleading in the sense it makes a 5% difference (what is the margin of error again?) look bigger than it is.
    • by MobyDisk ( 75490 )

      That chart is correct and follows best-practices. Only column charts must have the axis at zero.
      It is okay not to start your y axis at zero [qz.com]
      When should the y axis of a graph start at zero? [stackexchange.com]

      And a fun one:
      The most misleading charts of 2015: fixed [qz.com]

      • LOL thanks for the links

        From number 1:
        Always use a zeroed y-axis with column and bar charts. Of course column and bar charts should always have zeroed axes, since that is the only way for the visualization to accurately represent the data. Bar and column charts rely on bars that stretch to zero to accurately mirror the ratios between data points. Truncating the axis breaks the relationship between the size of the rectangle and the value of the data. There is no debating this one (except for a few except
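The distortion the quoted passage describes is simple arithmetic, sketched below. The two values are illustrative only, and `bar_height_ratio` is a hypothetical helper, not part of any plotting library.

```python
def bar_height_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at baseline."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Illustrative values only: two acceptance rates, in percent.
high, low = 72.0, 64.0

true_ratio = bar_height_ratio(high, low)          # axis starts at 0
drawn_ratio = bar_height_ratio(high, low, 60.0)   # axis starts at 60

print(f"zeroed axis:   taller bar is {true_ratio:.3f}x the shorter one")
print(f"axis from 60%: taller bar is {drawn_ratio:.3f}x the shorter one")
```

With the axis cut at 60%, the bar heights no longer reflect the ratio of the values themselves, which is the objection to truncated bar charts; for line or dot plots, where length does not encode the value, the same objection does not apply.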
  • by Anonymous Coward on Wednesday February 10, 2016 @04:47PM (#51481803)

    First off they trumpet the fact that they discovered that women's merge acceptances were higher than men's. It's only when they sliced the one hundred thousands of accounts for "gender confirmation" that they decided that bias existed because success rates went from 72% to 64% - The error deviation of that alone should cover the spread.

    Secondly the sample rating is awful - They compare TWO MILLION male checkins to ONE HUNDRED THOUSAND female checkins without any criteria for context, quality, need or style... just "quantity" and say that because the PERCENTAGE RATES FOR ACCEPTANCE are "higher" it must mean the women programmers are "Better" when comparing 2 sample sets with 20x the difference of checkins as they're all EQUAL.

    Sorry. That's BS.

    This is not science, this is propaganda statistics and poor statistics at that but I'm sure they made full use of their government funding to study gender issues in STEM fields.

  • by kevingolding2001 ( 590321 ) on Wednesday February 10, 2016 @06:09PM (#51482681)

    The interesting thing the study actually found was that pull-request acceptance rates dropped for BOTH males and females when the gender of the requester could be inferred from their username or avatar picture. In some categories that rate dropped more for males, and in others the rate dropped more for females.

    But they ignored the drop in rates for males and considered only the drop in rates for females when jumping to their conclusion of "gender bias".

  • I don't doubt there's bias against women, nor do I want to mansplain the results at all. However, I care deeply about good science and reliable facts. This article is not very good at showing clearly that this bias exists. Here are a few major problems with it:

    1. Are the samples of women and men who post on GitHub representative of all open source programmers? I would think that women tend to contribute publicly less than men, and tend to disclose their gender less than men, and this probably biases the sample.

  • Why is this poorly-researched inflammatory crap on Slashdot again?

    Is someone looking to siphon yet more funding away from gender-neutral coding projects and into more "X for women" programmes?

  • The study has not shown what the submission or the study says it shows. What it shows is that when separated into two groups, women who self identify as women and those who don't, the two groups have their submissions pulled at different rates.

    There is nothing in the study to show that the two groups are comparable in their ability to code. Another way to look at the numbers in the study would be to say that women who self identify as women are not as good at coding as those who don't. Both statements are e

  • by thecombatwombat ( 571826 ) on Wednesday February 10, 2016 @11:52PM (#51484711)

    The whole premise seems to be accepted pull requests = accepted developers. I mean they say:

    "To what extent does gender bias exist among people who judge GitHub pull requests?
    To answer this question, we approached the problem by examining whether men and women are equally likely to have their pull requests accepted on GitHub, then investigated why differences might exist."

    The authors note that women are more likely to submit pull requests that aren't tied to existing open issues. They seem to conclude that this reinforces the idea that women have the best track records, that these requests are the hardest to get accepted.

    "Thus, if women more often submit pull requests that address an immediate need and this is enough to improve acceptance rates, we would expect that these same requests are more often linked to issues."

    I interpret that totally the other way. The paper equates getting a pull request accepted with being accepted, that's just not how (in my experience) development works. If you submit a patch for some feature add that only you've thought of, and it conflicts with nothing else, it's easy for a maintainer to accept. A patch for a known, open issue is much more likely to have regression considerations, and compete with other patches. If five people all submit a patch for one issue, odds are good at least four of them are going to be rejected. It's kind of like measuring an employee's productivity by how many lines of code they write. Experienced developers see that as largely silly.
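The "five patches for one issue" argument can be put as a one-line bound. This is a sketch under the assumption that a maintainer merges at most one patch per issue; `max_acceptance_rate` is a hypothetical name.

```python
def max_acceptance_rate(competing_patches):
    """Upper bound on the share of patches merged when only one per issue can be."""
    if competing_patches < 1:
        raise ValueError("need at least one patch")
    return 1 / competing_patches


for k in (1, 2, 5):
    rate = max_acceptance_rate(k)
    print(f"{k} patch(es) for one issue: at most {rate:.0%} accepted")
```

Under that assumption, acceptance rates for issue-linked requests are capped by competition regardless of patch quality, which is why they are not directly comparable to rates for standalone feature requests.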

  • by tgv ( 254536 ) on Thursday February 11, 2016 @03:32AM (#51485407) Journal

    First statistical conclusion in the article is faulty: the significance is based on a chi square with df = 3,064,667. Every difference is significant with a df that high. The second statistical conclusion has the same error: significant, but the difference here is marginal. These people should really think if the underlying data truly only represents a difference in gender and all other possible variables are identical.

    But a large part of the article focuses on arguments like "they feel dejected" while in reality the numbers hardly differ. Not only that, they are even in the women's favor, even on the first request. How can you then complain about feelings of dejection or abandoning because of "an unreasonably aggressive argument style" (as if women are by definition incapable of that)? No, it's just clutching at straws because they have to write an article.

    But it's the final graph that is the nail in the coffin of this article: even with their self-chosen statistics, there is no difference in acceptance rate for men and women when gender is known (although "known" is too strong a word), even in the outsider category. They then phrase it like this: "There is a similar drop for men, but the effect is not as strong" while not having even the cheapest statistical argument to support it. That's the best they can come up with.

    So the conclusion of this article should be: women have a slight advantage in pull requests on github. The rest is FUD.
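The point that almost any difference becomes "significant" at this scale can be checked with a small sketch. The counts below are made up, not taken from the study, and `chi_square_2x2` is a hypothetical helper written with the standard library only. For a 2x2 table like this one the test has one degree of freedom; what grows is the sample size.

```python
import math


def chi_square_2x2(accepted_a, total_a, accepted_b, total_b):
    """Pearson chi-square statistic and p-value (df = 1) for two acceptance rates."""
    table = [
        [accepted_a, total_a - accepted_a],
        [accepted_b, total_b - accepted_b],
    ]
    n = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For df = 1 the chi-square tail equals a two-sided normal tail.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# The same hypothetical one-point gap (65% vs 64%) at two sample sizes.
small = chi_square_2x2(650, 1_000, 640, 1_000)
large = chi_square_2x2(650_000, 1_000_000, 640_000, 1_000_000)

print(f"n = 2,000:     chi2 = {small[0]:.2f}, p = {small[1]:.3g}")
print(f"n = 2,000,000: chi2 = {large[0]:.2f}, p = {large[1]:.3g}")
```

The statistic scales linearly with the sample size while the gap stays at one percentage point, so a p-value alone says little here; the effect size is the number to look at.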
