Gender bias in open source: Pull request acceptance of women versus men
  • From the post's link:

    We hypothesized that pull requests made by women are less likely to be accepted than those made by men. Prior work on gender bias in hiring – that women tend to have resumes less favorably evaluated than men (5) – suggests that this hypothesis may be true.

    To evaluate this hypothesis, we looked at the pull status of every pull request submitted by women compared to those submitted by men. We then calculate the merge rate and corresponding confidence interval, using the Clopper-Pearson exact method (15), and find the following:

    Gender | Open    | Closed  | Merged    | Merge Rate | 95% Confidence Interval
    Women  | 8,216   | 21,890  | 111,011   | 78.6%      | [78.45%, 78.87%]
    Men    | 150,248 | 591,785 | 2,181,517 | 74.6%      | [74.56%, 74.67%]

    4 percentage point difference overall.
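As a rough check of the quoted figures, here is a minimal sketch that recomputes the merge rates from the raw counts. It assumes the rate is merged / (open + closed + merged), which is my reading of the table rather than something the paper states in the quoted passage. The last digit may differ from the quoted rates depending on rounding. It does not reproduce the Clopper-Pearson intervals.

```python
# Counts copied from the quoted table: (open, closed, merged).
counts = {
    "women": (8_216, 21_890, 111_011),
    "men": (150_248, 591_785, 2_181_517),
}


def merge_rate(open_prs: int, closed: int, merged: int) -> float:
    """Share of all pull requests that ended up merged (assumed definition)."""
    return merged / (open_prs + closed + merged)


rates = {gender: merge_rate(*c) for gender, c in counts.items()}

# Difference between the two rates, in percentage points.
gap_pp = (rates["women"] - rates["men"]) * 100

for gender, rate in rates.items():
    print(f"{gender}: {rate:.1%}")
print(f"gap: {gap_pp:.1f} percentage points")
```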

    Pull requests can be made by anyone, including both insiders (explicitly authorized owners and collaborators) and outsiders (other GitHub users). If we exclude insiders from our analysis, the women’s acceptance rate (64.4%) continues to be significantly higher than men’s (62.7%) (χ²(df = 2, n = 2,473,190) = 492, p < .001)

    Emphasis mine. That's 1.7 percentage points.

    The final paragraph also omits how the acceptance rate changes after gender is "revealed" (username, profile image). The graph doesn't help either.

    For outsiders, we see evidence for gender bias: women’s acceptance rates are 71.8% when they use gender neutral profiles, but drop to 62.5% when their gender is identifiable. There is a similar drop for men, but the effect is not as strong. Women have a higher acceptance rate of pull requests overall (as we reported earlier), but when they’re outsiders and their gender is identifiable, they have a lower acceptance rate than men.

    So women drop from 71.8% to 62.5% = 9.3 percentage points. They say that's more than for men, but don't reveal the difference. Only the graph gives an indication (unless I'm missing a table), and it may be 5 (?) percentage points for men, which would be about 4 percentage points between both genders.

    Figure 5: Pull request acceptance rate by gender and perceived gender, with 95% Clopper-Pearson confidence intervals, for insiders (left) and outsiders (right)

    The conclusion:

    Our results suggest that although women on GitHub may be more competent overall, bias against them exists nonetheless.

    That's quite exaggerated for <=5 percentage points. Especially for the number of people involved.

    Out of 4,037,953 GitHub user profiles with email addresses, we were able to identify 1,426,121 (35.3%) of them as men or women through their public Google+ profiles.

    Maybe I missed it, but how many of those were women and how many made PRs?

    in a 2013 survey of the more than 2000 open source developers who indicated a gender, only 11.2% were women

    Let's compare the PR rate per gender:

    Let's say the percentage of women has not increased since 2013, which I'd find difficult to believe. At roughly 11%, that's 1,269,247 men and 156,873 women. Men made 150,248 + 591,785 + 2,181,517 = 2,923,550 PRs. Women made 8,216 + 21,890 + 111,011 = 141,117 PRs. That's ~2.3 PRs per man and ~0.9 PRs per woman. If the percentage changed and more women became contributors, that would decrease the PRs per woman.
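The back-of-the-envelope arithmetic can be reproduced with a short sketch. It assumes a share of 11% women (the survey figure of 11.2% rounded down, which is what the head counts in the comment correspond to) and that this share applies to the 1,426,121 identified profiles. The variable names are mine.

```python
identified_profiles = 1_426_121
share_women = 0.11  # assumption: the survey's 11.2%, rounded down to 11%

# Estimated head counts, truncated to whole people.
women = int(identified_profiles * share_women)
men = int(identified_profiles * (1 - share_women))

# Total PRs per gender: open + closed + merged, from the quoted table.
prs_women = 8_216 + 21_890 + 111_011
prs_men = 150_248 + 591_785 + 2_181_517

prs_per_woman = prs_women / women
prs_per_man = prs_men / men

print(f"women: {women:,} people, {prs_women:,} PRs, {prs_per_woman:.1f} PRs each")
print(f"men:   {men:,} people, {prs_men:,} PRs, {prs_per_man:.1f} PRs each")
```

Raising `share_women` while keeping the PR counts fixed lowers `prs_per_woman`, which is the effect described in the last sentence of the paragraph.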

    That leads me to ask:

    • Are women more hesitant to contribute PRs that might not be merged? If so, it might contribute to why their PRs are merged more often.
    • Are the women with accounts on GitHub more likely to be people who have some kind of education in the IT field? If there are fewer hobbyist women (percentage-wise) on GitHub, and more hobbyist men who just chuck their stuff online and then decide to contribute to a project, it might contribute to PR acceptance (you're comparing pros to amateurs).
    • What does a similar acceptance rate for roughly double the number of PRs per man actually say? I don't know, but it might be pertinent.

    I very much encourage humans to contribute to open source. So, while this paper says something about the current state of things, it doesn't seem like it's saying much. The differences in pull request acceptance are not very significant (<5 percentage points) to me.

    Anti Commercial-AI license

  • Sony Music warns AI companies against “unauthorized use” of its content
  • It's dead easy. Yet GitHub didn't do it when training Copilot and is now being sued because of it.

    It is also easy to build a database of copyrighted material and check whether revealed training data matches it. The copyright licence doesn't necessarily need to be attached. It just makes it easier to spot.

    Also, what are you arguing here? That because copyright is easy to ignore, it should be or that it's pointless? Is that the advice you'd give anybody else too? "You know what Disney, everyone ignores copyright, so why not make everything public domain?"

    Anti Commercial-AI license

  • Reddit’s deal with OpenAI will plug its posts into “ChatGPT and new products”
  • From what I understand, LLMs are just large heuristic machines. They gather a lot of statistics on token order and answer with whatever statistically ranks higher than the other options. There's no "understanding". So to answer your question: no, they don't understand the license.
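To make the "statistics on token order" idea concrete, here is a deliberately tiny sketch: a bigram counter that returns the token most often seen after a given one. It is only a toy under my own assumptions (made-up corpus, hypothetical function name). Real LLMs are neural networks conditioned on long contexts, not lookup tables, but the "pick what ranks highest statistically" step is similar in spirit.

```python
from collections import Counter, defaultdict

# Made-up training text, split into whitespace tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# For every token, count which tokens follow it in the corpus.
follow_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1


def most_likely_next(token: str) -> str:
    """Return the token that most often followed `token` in the corpus."""
    return follow_counts[token].most_common(1)[0][0]


print(most_likely_next("the"))
```

Nothing in the table knows what a cat or a license is. It only records which token tended to come next.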

    Content is most likely scraped wholesale from websites, possibly run through some cleanup to filter out absolute garbage, and fed into an LLM to train it. An LLM can be tricked into revealing its training data (e.g. by asking it to repeat "fruit" forever). It's in those cases that copyright infringement is detected and action can be, and has been, taken. There are court cases currently under review, the most popular being the one against GitHub Copilot for infringing on the license of source code it ingested.

    Anti Commercial-AI license

  • Twitter/x.com is now forcing you to disable Firefox's Enhanced Tracking Protection.
  • Makes sense. It's the USAtion of the world. Their puritanism is spreading. Wouldn't surprise me if people started censoring themselves when saying "moist", but getting excited when talking about guns, wars, and bombing the middle east.

    Anti Commercial-AI license

  • Does anybody actually use trunk based development in their company?

    I've heard it thrown around in professional circles and how everybody's doing it wrong, so.. who actually does use it?

    For smaller teams

    "Scaled" trunk based development

    57
    Public personal dev accounts: opinions?

    I feel like there are many devs out there who expose a lot of personal details and opinions all over the web. Maybe it's just me, but when starting out with the internet I tried my best to separate my personal details (name, age, sex, country, ethnicity, family ties, relationship status,...) from usernames in public.

    Seeing devs do it willingly and voice opinions on divisive or sensitive topics kind of messes with me. Aren't y'all afraid of missing out on job opportunities if someone reads your opinions, code, or other stuff tied to your personal accounts? Or letting anybody (maybe family, friends, acquaintances, ...) in on your personal life, mindset, opinions and other personal information?

    Anti Commercial-AI license

    41
    CID concept is broken
    discuss.ipfs.tech CID concept is broken

    Hello everyone! I really like the ideas behind IPFS and I want to share some feedback about the design of the IPFS. The core of the problem is that CID concept is wrong in it’s current implementation. I know it sounds blunt and harsh so let me clarify: IPFS at its core claims to be a content add...

    TL;DR IPFS's "content addresses" don't actually address the content but a tree of the content stored in a protocol buffer, making it impossible to convert a hash to a content address.
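A toy sketch of the complaint, under loud assumptions: this is NOT IPFS's real encoding (no protocol buffers, no multihash, no CID format), just a hypothetical two-level hash tree. It shows why an address derived from a tree of chunks differs from the plain hash of the bytes, and why it also depends on how the content was chunked.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Plain SHA-256 of the raw bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def toy_tree_address(content: bytes, chunk_size: int) -> str:
    """Hash every chunk, then hash the concatenated chunk digests.

    Hypothetical layout for illustration only, not the IPFS wire format.
    """
    chunks = [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]
    leaf_digests = [hashlib.sha256(chunk).digest() for chunk in chunks]
    return sha256_hex(b"".join(leaf_digests))


content = b"hello ipfs " * 1000

plain_hash = sha256_hex(content)
address_small_chunks = toy_tree_address(content, chunk_size=256)
address_large_chunks = toy_tree_address(content, chunk_size=1024)

print(plain_hash)
print(address_small_chunks)
print(address_large_chunks)
```

All three values differ for the same bytes, so knowing only the file's SHA-256 is not enough to derive the tree-based address.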

    DHT of CIDs? More like a Distributed Table of Lies!

    0
    Can somebody explain why game makers don't start their own companies together?

    It seems like every other week a game studio is massively laying off employees; sometimes after years of development. What I'm reading is that it's a quick way to lower expenses and pad the investors' pockets, flooding the market with developers and reducing their value, to then hire them back a few months later at lower salaries.

    So, what's holding back gamedevs from banding together to either unionize or start their own companies with better conditions than the purely money-driven studios? Why aren't they trying to be better? Nobody willing to invest in them? Does starting a company together mean they will now be the bosses who have to answer to the investors, ensure returns, and fire employees? Is the world just an entire shit-cake?

    61
    Does an anonymous git service exist on I2P?

    Some projects have been DMCA'ed and hosting them on I2P could be a viable alternative.

    9
    How are companies or developers supposed to make a full time living with OSI opensourced projects?
    opensource.org The Open Source Definition

    Introduction Open source doesn’t just mean access to the source code. The distribution terms of open source software must comply with the following criteria: 1. Free Redistribution The licens…

    There has been a lot of talk about companies and individuals adopting licenses that aren't OSI opensource to protect themselves from mega-corp leechers. Developers who put donation notices in the command line or during package installation have also been condemned. Projects with opensource cores and paid extensions have also been targets of vitriol.

    So, let's say we wanted to make it possible for the majority of developers to work on software that strictly follows the definition of opensource, which models would be acceptable to make enough money to work on those projects full-time?

    34
    TIL: FairCode is the software model Redis, ElasticSearch, etc. use

    Fair-code is not a software license. It describes a software model where software:

    • is generally free to use and can be distributed by anybody
    • has its source code openly available
    • can be extended by anybody in public and private communities
    • is commercially restricted by its authors
    45
    How do you holistically document microservices in a multi-repo setup?

    Let's say I had a few microservices in different repositories and they communicated over HTTP using JSON. Some services are triggered directly by other microservices, but others can be triggered by events like a timer going off, a file being dropped into a bucket, a firewall rule blocking X amount of packets and hitting a threshold, etc.

    Is there a way to document the microservices together in one holistic view? Maybe, how do you visualise the data, its schema (fields, types, ...), and its flow between the microservices?

    -------

    Bonus (optional) question: Is there a way to handle schema updates? For example generate code from the documentation that triggers a CI build in affected repos to ensure it still works with the updates.

    Anti Commercial-AI license

    18
    IBM nearing deal for cloud software provider HashiCorp, source says

    April 23 (Reuters) - International Business Machines (IBM.N) is nearing a deal to buy cloud software provider HashiCorp (HCP.O), according to a person familiar with the matter. Hashicorp's stock surged 24%, giving it a market value of $6.1 billion, after the Wall Street Journal first reported the talks.

    0
    PSA: If you're going to write software for piracy, put it on I2P!

    movie-web was just taken down with all its repos. Yuzu was taken down, then suyu forked it on GitLab and was taken down too. Countless clones of Nintendo games, platform emulators, and a bunch of other things are taken down because they are hosted on the clear web.

    If you're a dev and planning to write software for piracy, host it on I2P!

    50
    Do particles get mass from the higgs field by moving through a higher dimension?

    So, I watched "The Higgs Field, explained" by Don Lincoln. It explains that particles are massless and that it is only through their interaction with the Higgs field that they gain mass. However, how are they "moving" through the Higgs field? Is it through a movement in the 3rd dimension or a dimension above?

    And related: does the movement through the Higgs field generate gravitons that affect the particles they interact with by "pulling" them in the opposite direction to the one in which they were traveling?

    Anti Commercial-AI license

    2
    What are algebraic data types and why are they named as such?

    The Wikipedia articles are terribly written (they seem aimed at math lovers or people who just need to refresh their knowledge).

    What is a "sum" of types? What is a product of types? Is it possible to Cat x Dog or Cat + Dog? What does that even mean?
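One way to make the question concrete is a small sketch with made-up variants (the `Cat` and `Dog` members and the `CatAndDog` name are hypothetical). A product type holds a Cat and a Dog at the same time, while a sum type holds either a Cat or a Dog. Counting the possible values shows one common reading of the name "algebraic": the counts multiply for products and add for sums.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product
from typing import List, Union


class Cat(Enum):  # a type with 2 possible values
    TABBY = "tabby"
    SIAMESE = "siamese"


class Dog(Enum):  # a type with 3 possible values
    BEAGLE = "beagle"
    POODLE = "poodle"
    HUSKY = "husky"


@dataclass(frozen=True)
class CatAndDog:
    """Product type (Cat x Dog): every value carries one Cat AND one Dog."""
    cat: Cat
    dog: Dog


# Sum type (Cat + Dog): every value is one Cat OR one Dog.
CatOrDog = Union[Cat, Dog]

all_products: List[CatAndDog] = [CatAndDog(c, d) for c, d in product(Cat, Dog)]
all_sums: List[CatOrDog] = [*Cat, *Dog]

print(len(all_products))  # number of Cat x Dog values: 2 * 3
print(len(all_sums))      # number of Cat + Dog values: 2 + 3
```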

    2
    Microsoft Cybersecurity Disaster Triggers Customer Doubt, Competitor Opportunity
    accelerationeconomy.com Microsoft Cybersecurity Disaster Triggers Customer Doubt, Competitor Opportunity

    A federal watchdog group's report exposes major flaws in Microsoft's cloud cybersecurity, demanding urgent action from CEO Satya Nadella to address widespread shortcomings and restore customer trust amidst escalating cyber threats.

    9
    Tuxedo Computers Sirius laptop was reviewed in this podcast (26:53)
    linuxunplugged.com Glide like a Goose, Honk like a Moose

    We test the Linux-first, all-AMD Sirius 16 laptop, discuss the new Hyprland release, and share a few stories from our recent trip.

    0
    Why did you buy a big phone?

    Just finished watching MKBHD's video "Small Phones are Dead and We Killed Them" (yt, invidious).

    I'm wondering, why is it that people buy big phones? Is it a conscious decision? Something that just unconsciously happened while selecting a phone? A lack of choice? What?

    58
    New AI technology enables 3D capture and editing of real-life objects
    techxplore.com New AI technology enables 3D capture and editing of real-life objects

    Imagine performing a sweep around an object with your smartphone and getting a realistic, fully editable 3D model that you can view from any angle. This is fast becoming reality, thanks to advances in AI.

    6
    "Initials" (https://github.com/dicebear/dicebear) by "DiceBear", licensed under "CC0 1.0" (https://creativecommons.org/publicdomain/zero/1.0/)
    onlinepersona @programming.dev
    Posts 81
    Comments 2.7K