Over 3.1 Million Fake 'Stars' on GitHub Projects Used To Boost Rankings (bleepingcomputer.com) 22
Researchers have uncovered widespread manipulation of GitHub's star-rating system, with over 3.1 million fraudulent stars identified across 15,835 repositories, according to a new study by Socket, Carnegie Mellon University, and North Carolina State University.
The research team analyzed 20TB of data from GHArchive, spanning 6 billion GitHub events from 2019 to 2024, using their "StarScout" detection tool. The tool identified 278,000 accounts engaging in coordinated inauthentic behavior to artificially boost repository rankings.
GitHub uses stars, similar to social media likes, to rank projects and recommend content to users. The platform has previously encountered malicious exploitation of this system, including the "Stargazers Ghost Network" malware operation discovered last summer. Approximately 91% of flagged repositories and 62% of suspicious accounts were removed by October 2024.
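As a rough illustration of the kind of scan described above, here is a minimal Python sketch that reads GHArchive-style event records (one JSON object per line, with `type`, `actor.login`, and `repo.name` fields) and flags accounts whose entire observed activity is starring. The function name `low_activity_stargazers`, the `max_events` threshold, and the heuristic itself are assumptions made for illustration; this is not StarScout's actual algorithm, which the summary does not describe in detail.

```python
import json
from collections import defaultdict


def low_activity_stargazers(event_lines, max_events=1):
    """Return logins whose only observed events are stars (WatchEvent).

    Hypothetical heuristic for illustration only: an account is flagged
    when every event it produced is a star and it produced at most
    `max_events` events in total.
    """
    total = defaultdict(int)  # all events per account
    stars = defaultdict(int)  # star events per account
    for line in event_lines:
        event = json.loads(line)
        login = event["actor"]["login"]
        total[login] += 1
        if event["type"] == "WatchEvent":  # GitHub records a star as a WatchEvent
            stars[login] += 1
    return sorted(
        login
        for login, count in total.items()
        if stars[login] == count and count <= max_events
    )


# Tiny made-up sample; real GHArchive dumps are gzipped hourly JSON-lines files.
sample = [
    json.dumps({"type": "WatchEvent", "actor": {"login": "alice"}, "repo": {"name": "org/lib"}}),
    json.dumps({"type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "alice/app"}}),
    json.dumps({"type": "WatchEvent", "actor": {"login": "bot1"}, "repo": {"name": "org/lib"}}),
    json.dumps({"type": "WatchEvent", "actor": {"login": "bot2"}, "repo": {"name": "org/lib"}}),
]

flagged = low_activity_stargazers(sample)
print(flagged)
```

A heuristic this simple would also flag real users who created an account just to star one project, so any serious detector would need additional signals, such as many accounts starring the same repositories in lockstep.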
No one goes to the internet .... (Score:2)
Perhaps I'm wrong here, but my observation is that it's pretty rare to see someone have a great experience and say "I'ma post that on the internet!" But let someone have a Karen-style fail of the "I want to see your MANAGER!" kind, and it's a 10,000 post screed.
If it's on the internet, pretty much I count on it being enshittified and wrong. That's the attitude that 30 years of ToS work in conjunction to "normal" work. Oh, by the way, when you do post "this is how to get this to work/setup" may want to ensure there's a date, and especi
Re: (Score:3)
Look again. The flurry of faked stars on projects whose sponsors paid such "media enhancement" experts to publish astroturf reviews is growing as rapidly as they can build botnets and flood the channels.
But Why? (Score:5, Interesting)
Why would anyone bother or care about this? You aren't going to github to look for a high ranked project to work on, you go to github to find a stable, properly-licensed, fully maintained project to either install or work on.
You don't go there and click the highest rated project to look at, it's not youtube or facebook. Something else is going on here that we probably don't know "why" this is happening. I suppose there is always the possibility of non-legitimate AI projects trying to push their SEO on google by having their projects have more visibility in the github search engine, but still. Why? Why would you download a project that isn't the thing you need?
Re: But Why? (Score:3)
I can only think of something malicious. Maybe you are trying to trick people into installing your version of it instead of upstream. By massively starring your fork, you might confuse enough people that you gain some users who download your library because they search on GitHub instead of finding the library's main page and following its GitHub link.
That may get you a backdoor into some machines. You can't target a particular one. But you can probably botnet a few like that.
Need to define "fully maintained project" (Score:3)
Ignoring the stars as it's somewhat of a misdirection.
How would you vet an open source project in general for use on your corporate project? Assume an internal project, not one resold or distributed to others, for simplicity.
Re: (Score:2)
How would you vet an open source project in general for use on your corporate project?
Assuming it's not a well known project, I would look at the source code. Incidentally, the source code for LZO is quite good but [oberhumer.com] also quite an eye opener.
Stability over time also matters, as well as potential to maintain the project in-house if the maintainers disappear.
Re: (Score:2)
Why would anyone bother or care about this? You aren't going to github to look for a high ranked project to work on, you go to github to find a stable, properly-licensed, fully maintained project to either install or work on.
You don't go there and click the highest rated project to look at, it's not youtube or facebook. Something else is going on here that we probably don't know "why" this is happening. I suppose there is always the possibility of non-legitimate AI projects trying to push their SEO on google by having their projects have more visibility in the github search engine, but still. Why? Why would you download a project that isn't the thing you need?
Taking a wild stab at a guess here: Maybe the AI trainers or some ranking algorithm somewhere uses a star rating. I could see star rankings mattering to the scammier investment types looking for code to steal or feed to their snappy new AI everything programming solution blahdeblahblah. Once the bigger tech corps have managed to push human involvement completely off the web and it's just AI feeding AI from historical human creations, those ratings will probably mean a whole lot more than they mean today.
Or
Re:But Why? (Score:5, Informative)
Corporate HR uses stars for hiring decisions in many cases.
Maybe H-1B exceptions do too?
I recently learned that statute provides for 85,000 H-1B visas per year, plus room for special exceptions, and the "exceptions" are about 715,000. This would obviously fail the "Major Questions Test" if challenged.
A guy on Twitter downloaded the entire DoL database and did analytics on it. Several recent news stories are using it.
Everything is gamed that can be, it seems.
Re: (Score:2)
I personally use the star system as a way to bookmark interesting github projects. If I come across a project that seems interesting but I have no use for it, I give it a star. I can then go through the list of stars and have a list of projects.
Thus when I get to the "I heard of a project that does X" I can go through the list and find it.
There aren't really many other ways to remember a project. I suppose you could fork them, but that then clutters up your repository list and you can run into space quota
Re: (Score:2)
There aren't really many other ways to remember a project. I suppose you could fork them
You don't need to rely on ad-hoc methods using site-specific features like that. Most browsers have a 'bookmark' feature that lets you easily save the address of a particular page along with a short description. Most browsers also offer a way to organize these "bookmarks", making them easy to find later. Some browsers will even use your bookmarks as suggestions when using the address bar. Bookmarks are also easy to transfer between most browsers, so you don't need to worry about vendor lock-in. Some br
Re: (Score:2)
Bookmarks don't port to other machines, and are often flushed by the "scrub your browser settings" cleanups advised by helpdesk personnel.
Re: (Score:2)
Bookmarks don't port to other machines
Nonsense. Of course they do. I have bookmarks going back to the mid 90's that have traveled effortlessly from machine to machine.
and are often flushed by the "scrub your browser settings" cleanups advised by helpdesk personnel.
Nonsense. Not even the dullest helpdesk drone would tell you to delete your bookmarks!
Re: (Score:2)
Bookmarks port when you have a shared, centralized account or similar configuration consistency tool running. It's not automatic for distinct workstations or working environments.
The "dullest helpdesk drones" do this as a matter of course to clear any individual configurations, and to discourage you from calling them again with minor problems.
Re: (Score:2)
The "dullest helpdesk drones" do this as a matter of course to clear any individual configurations
Bullshit.
Re: (Score:2)
You may not. The first listed project will get a lot more attention, even if it's not ideal for your needs on further review.
scouting_irony.git (Score:3)
The tool** identified 278,000 accounts engaging in coordinated inauthentic behavior to artificially boost repository rankings.
** This is when we find out the tool was found on GitHub.
5/5 stars I hear. No shit.
Is nothing sacred anymore? (Score:2)
Just think of the children!