Samsung Software Engineers Busted For Pasting Proprietary Code Into ChatGPT (pcmag.com)
Multiple employees of Samsung's Korea-based semiconductor business plugged lines of confidential code into ChatGPT, effectively leaking corporate secrets that could be included in the chatbot's future responses to other people around the world. PCMag reports: One employee copied buggy source code from a semiconductor database into the chatbot and asked it to identify a fix, according to The Economist Korea. Another employee did the same for a different piece of equipment, requesting "code optimization" from ChatGPT. After a third employee asked the AI model to summarize meeting notes, Samsung executives stepped in. The company limited each employee's prompt to ChatGPT to 1,024 bytes.
Just three weeks earlier, Samsung had lifted its ban on employees using ChatGPT over concerns around this very issue. After the recent incidents, it's considering reinstating the ban, as well as disciplinary action for the employees, The Economist Korea says. "If a similar accident occurs even after emergency information protection measures are taken, access to ChatGPT may be blocked on the company network," reads an internal memo. "As soon as content is entered into ChatGPT, data is transmitted and stored to an external server, making it impossible for the company to retrieve it."
The OpenAI user guide warns users against this behavior: "We are not able to delete specific prompts from your history. Please don't share any sensitive information in your conversations." It says the system uses all questions and text submitted to it as training data.
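The 1,024-byte prompt cap mentioned in the summary is a byte limit, not a character limit, which matters for Korean-language prompts. A minimal sketch of such a check (the function name and policy details are hypothetical; only the 1,024-byte figure comes from the article):

```python
# Sketch: enforcing a byte-based prompt cap like the one Samsung
# reportedly imposed. Hypothetical helper; 1,024 bytes is the figure
# reported by The Economist Korea.
MAX_PROMPT_BYTES = 1024

def prompt_within_limit(prompt: str) -> bool:
    # A byte limit differs from a character limit: each Hangul
    # syllable is 3 bytes in UTF-8, so 1,024 bytes is far fewer
    # than 1,024 characters for Korean text.
    return len(prompt.encode("utf-8")) <= MAX_PROMPT_BYTES

print(prompt_within_limit("a" * 1024))  # True: 1,024 ASCII bytes
print(prompt_within_limit("한" * 342))   # False: 342 * 3 = 1,026 bytes
```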
Each company an LLM (Score:3, Interesting)
After the mining bubble, where miners were buying $2,000 RTX cards like they were hotcakes, Nvidia now gets to enjoy an AI bubble where companies buy $20k cards like they're hotcakes.
Re: (Score:2)
Yep, so much for GPU prices falling further, except for the ones with little VRAM that aren't as good for these tasks.
Re:Each company an LLM (Score:4, Informative)
Actually, AI is moving away from GPUs and toward dedicated silicon like TPUs and NPUs.
You can't buy TPUs, but you can rent them by the minute in the cloud.
Re:Each company an LLM (Score:4, Interesting)
It's the same companies buying them, though; unlike miners, it won't be joe scumbag buying them, it'll be Amazon and Microsoft. So if Nvidia tries selling its gfx cards as AI processors, all Intel needs to do is come along with 10x-performance Tensor processors and Nvidia will be begging us to buy its overpriced cards.
Re: (Score:2)
Kind of the way mining moved away from GPUs. If there's one thing that'll be impacted by AI, it's memory, and that's already in a slump.
Re: (Score:2)
https://www.mouser.com/Product... [mouser.com]
Re: (Score:2)
Never mind the politics, this is a nerd site.
The real questions are: Did it find the bug? Did it optimize the code?
Re: Each company an LLM (Score:2)
Yes, it did. Otherwise, the others would not follow.
Re: (Score:1)
People dumber than ChatGPT (Score:3)
Quite an achievement to be sure.
Re:People dumber than ChatGPT (Score:4, Insightful)
No. If you have a large number of people in a group, have them do things for a few days, and then pick the three who did the most stupid things, you will get some incredibly stupid results.
It does not necessarily mean that those three people who did the most stupid things in my example are more stupid than normal people (though they might be), as everyone does something stupid from time to time.
Re: (Score:3)
Pasting confidential info into a website? You have to be _very_ stupid to do that.
Re:People dumber than ChatGPT (Score:4, Insightful)
not at all.
Remember, the summary says Samsung allowed ChatGPT to be used after deciding this wasn't an issue.
So the employees used ChatGPT as they were allowed to by higher-ups.
The higher-ups then decided that this was an issue, so they punished the employees for doing what they had been told was OK.
It's not stupid, it's ignorance. Not everyone knows that ChatGPT keeps your prompts as part of its training data; privacy and data mining are things, and we expect websites (which ChatGPT effectively is) not to steal your private data and use it as if it belongs to them. Expect more lawsuits over this.
Re: (Score:2)
Well, ignorance is certainly a part of it. But the reason for that ignorance really needs to be either stupidity or apathy.
Re: (Score:3)
Really? Companies allow - even require - their employees to paste confidential information into cloud services like Gmail, Office365 and Salesforce. There's no reason in principle why generative AI is any different, except OpenAI's rapacious privacy policy.
Re: (Score:2)
Well "no reason" if one ignores the differences between the two, but that's too damned nuanced this early in the morning.
Re: (Score:2)
Any sane employee IT security policy will explicitly forbid that and things like Office365 become only allowed with a specific exception.
Re: People dumber than ChatGPT (Score:1)
Re: (Score:2)
You probably have never heard of medical data, banking data, specifically privileged personal data and high-value company secrets. In some countries, putting specific data into O365 or GSuite can even be a crime on the part of the employee doing it.
Re: People dumber than ChatGPT (Score:1)
This isn't dumb this is over work (Score:2)
Re: (Score:2)
Actually, it is always a problem, but I agree that we have decidedly too many CEOs that should be behind bars or at the very least get fired immediately and then be made permanently unemployable.
That said, I recently had the pleasure to give a regulated financial processor a nice big "red" audit finding because they did not require their C-levels to use 2FA when travelling, but everybody else had to do it. They did not even try to argue that one, I wonder why.
What was the source for code leaks? (Score:4, Interesting)
Did OpenAI tell Samsung that the code was pasted into the chat window?
Did Samsung itself detect that an employee was pasting code into a window?
Re: (Score:2)
Did OpenAI tell Samsung that the code was pasted into the chat window?
Did Samsung itself detect that an employee was pasting code into a window?
It's possible the employees told someone, which then set off the firestorm. They probably thought it was OK since Samsung had lifted the ban; and Samsung probably just assumed employees would know what not to put into ChatGPT.
Re: (Score:2)
Regarding deletion, ripped from the OpenAI documentation:
"Can you delete my data?"
"Yes, please follow the data deletion process."
Re: (Score:3)
I would bet money that Samsung itself detected this. It is pretty common for companies to have mandatory web proxies -- direct HTTP and HTTPS traffic outside the company is blocked unless it goes through a proxy that logs both directions of transfer. This protects against going to suspicious or unauthorized web sites, allows for malware scanning, and lets the company monitor for exfiltration of sensitive data.
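The proxy-based monitoring described above can be illustrated with a toy log scan. This is a hypothetical sketch: the log format, host list, and threshold are invented, and real DLP systems inspect decrypted request bodies rather than just byte counts.

```python
# Sketch of the kind of proxy-log check the parent describes:
# flagging large uploads to an AI chat endpoint. All names and
# thresholds here are illustrative, not any real Samsung system.
import csv
import io

SUSPECT_HOSTS = {"chat.openai.com"}
UPLOAD_THRESHOLD = 4096  # bytes sent in one request; tune per policy

LOG = """user,method,host,bytes_sent
kim,POST,chat.openai.com,18452
lee,GET,chat.openai.com,512
park,POST,intranet.example.com,90000
"""

def flag_uploads(log_text: str) -> list[str]:
    """Return users who POSTed suspiciously large bodies to watched hosts."""
    reader = csv.DictReader(io.StringIO(log_text))
    return [
        row["user"]
        for row in reader
        if row["host"] in SUSPECT_HOSTS
        and row["method"] == "POST"
        and int(row["bytes_sent"]) > UPLOAD_THRESHOLD
    ]

print(flag_uploads(LOG))  # ['kim']
```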
Re: (Score:3)
Uh, no. Samsung lifted the ban on using ChatGPT. They did not lift the ban on sharing confidential information with external third parties. Their mistake was thinking that their employees would use one without violating the other. Some employees cannot tell the difference between "can" and "should".
We are not table (Score:2)
"We are not able"? Impossible is not a word that judges are going to take seriously.
If someone enters someone's personal data into a GPT system and it comes out elsewhere or it's all part of some longwinded revenge doxxing troll, and then a lawsuit prevails where a judge orders OpenAI to remove someone's personal data from the system, OpenAI -are- going to do it. They don't get a choice.
Re: (Score:2)
Do you suppose that Samsung have the time and resources to file a lawsuit in the US every time one of their engineers posts some random chunk of code into a chat window? Would they be happy with the publicity that dozens of lawsuits created?
And if a huge multinational like Samsung won't do it, who will? Remember, it was Samsung that said "data is transmitted and stored to an external server, making it impossible for the company to retrieve it" - not ChatGPT.
Re: (Score:3)
and then a lawsuit prevails where a judge orders OpenAI to remove someone's personal data from the system, OpenAI -are- going to do it. They don't get a choice.
Maybe. The system doesn't actually store someone's personal data, so they can't be expected to remove it. It stores information about how their personal data is similar to other data, but it doesn't retain the data itself.
Re: (Score:3)
"We are not able"? Impossible is not a word that judges are going to take seriously.
If someone enters someone's personal data into a GPT system and it comes out elsewhere or it's all part of some longwinded revenge doxxing troll, and then a lawsuit prevails where a judge orders OpenAI to remove someone's personal data from the system, OpenAI -are- going to do it. They don't get a choice.
While I agree they would make a good faith effort to comply, it may be easier ordered than done. Depending on how the algorithm combines data, much of the information used in the response may not be directly related to the doxxing target. OpenAI could easily remove information such as names, addresses, etc., but it is conceivable the information could be created without actual personal data of the target, based on inferences and public data sources. Directly linked data, such as X Y lives at Z, could be deleted.
Re: (Score:2)
I think if the training was that direct/verbatim then ChatGPT would have become Tay already.
Company lifts ban ... (Score:1)
You couldn't make this shit up.
Some people are morons. (Score:2)
Re: (Score:2)
Maybe it depends on the code and the person writing it. I'm guessing that people asking ChatGPT for code are not good at writing it themselves, TFA mentions finding bugs and optimizing code. Maybe ChatGPT is not the best mentor in this case.
The last noob I worked with wrote terrible code, and his variable and function names had nothing to do with anything (literally stuff like $thing =). You could paste his 'proprietary' code into ChatGPT, you are not leaking anything or helping anyone.
Re: (Score:2)
You can be using wandbox or godbolt, and you STILL shouldn't paste any company code on there. It's called professional responsibility.
Re: (Score:2)
Not arguing *for* this kind of activity; you're right, people should not be pasting private code into any browser. I'm just suggesting that it's kind of a noob move and might reveal less than you think.
I suspect Samsung does some kind of DLP to detect this.
Re: (Score:2)
"It's right there in the EULA. You did read the EULA in its entirety, right? Just like you do for every web site that you use? Really???? You don't???? What kind of moron..." Frankly, it isn't moronic behavior to use a tool you've been told you can use, especially when most of the other cloud tools that can be used at work have "enterprise" versions that properly isolate and protect a company's data. Unless the access to ChatGPT came with some serious training on data handling, this was a pretty predictable outcome.
Re: (Score:2)
And no, you don't need to read the EULA. You ASSUME that there's nothing that will protect you.
I write C++ for a living. wandbox and godbolt are the go to tools of most C++ programmers. I don't paste any company code into wandbox or godbolt, and I don't read their EULA. I don't care about what their EULA says. I only ever either type new code, or some code from StackOverflow, into it.
especially when most of the other cloud tools that can be used at work have "enterprise" versions that properly isolate and protect a company's data.
I literally said, in the FIRST SENTENCE:
I don't know how anyone can think pasting any proprietary code into any web service you don't control is a good idea.
Learn to read.
Re: (Score:2)
Most people are not going to be able to distinguish which services the company does not control from the ones the company does control. It used to be really simple to just say "do not post our proprietary information online". But now it is all online, and without training, it's impossible to expect most people to unravel which is safe or not, so we block things at the firewall and whitelist the systems that are OK. Samsung whitelisted ChatGPT. This result should have been expected even from completely competent employees.
Re: (Score:2)
> Most people are not going to be able to distinguish which services the company does not control from the ones the company does control.
I seem to have more faith in mankind than you do. Companies like Samsung probably don't hire a lot of people below the center of the bell curve. There are various ways to identify whether a service is company-provided or not, such as: is it company-branded, is it hosted under a company domain, does it have company login.
Not blocked by the proxies? I guess that makes it look
Re: (Score:2)
> I seem to have more faith in mankind than you do.
Heh. Perhaps. My particular faith is backed by a lot of evidence. My company hires really high on the skill level for each particular job, but to maintain security, we've had to go to explicit allow lists for pretty much the whole Internet. Sites are banned by default until review. There are some unexpected loopholes (Stack Overflow being the big one), but mostly it's pretty hard to upload company data to a non-approved site without really being intent on it.
Why is anyone surprised by this? (Score:4, Insightful)
The current version of AI has been sold as a magic fix for every problem, and laziness and stupidity do the rest. Technical types are some of the worst offenders because they are true believers who can't imagine that something so high-tech and cool is fundamentally flawed and untrustworthy. You don't think so? Just look at the drivel that permeates every topic here on Slashdot.
And there it is (Score:2)
A much simpler but similar effect is keyword gaming to make certain the results you want show up first in the search window. Things like ChatGPT will just have the intentional pecuniary or political bias hidden deeply, nowhere the average user can see or understand it.
Pretty inevitable (Score:5, Insightful)
Code monkeys gonna do what code monkeys gonna do.
"Please debug this for me, and figure out why the include file with the API password isn't getting loaded properly. I'll paste the whole file so you have all the info."
Re: (Score:3)
Exactly. HR better start requiring in-person coding tests for prospective hires.
Re: Pretty inevitable (Score:2)
Hehe. I did that once. It was six stapled pages of intentionally dumbed-down and buggy C++ that was peripherally related to the task we were hiring specifically for.
Some guys got most of it, some made arithmetic mistakes or didn't quite meet the reqs of the prompt, but one dude took a look at it and proudly declared, "This is terrible code. You can get someone at half my rate to write this kind of code!"
He was an older white dude who I infer from context didn't actually know how to code, but maybe he jetted
Re: (Score:2)
Samsung makes good hardware, but generally speaking, the software is of questionable quality.
Re: (Score:3)
Samsung would be better off setting up their own local version, just like some set up their own personal cloud, administered the same way for both.
Feed yourself into the code-chipper (Score:4, Interesting)
I've looked more at images and conversations rather than code, but to me the difference between model output and human output is really quite obvious. Model output is all corporate-speak. It's entirely like reading something written by Marketing. I suppose code might be more human-like, given that code is such an intentionally dumbed-down language.
Re: Feed yourself into the code-chipper (Score:2)
I refuse to register so I've never seen it. But a guy at work swears by the bash scripts it makes to scrape web content.
I see (Score:2)
So the same crime as putting the code into Microsoft Word or sending it with Gmail?
Re: I see (Score:2)
Didn't OpenAI promise not to use the user data? (Score:3)
Use ChatGPT in the real world for real work? (Score:2)
Including the #GSOD code ? (Score:2)
I wonder if ChatGPT would be able to fix, or at least find proper workaround for the #GSOD problem [youtube.com], since @Samsung @SamsungMobile #Samsung #SamsungMobile have long forgotten what a client is, and won't acknowledge their responsibility.
Hmmm .. (Score:2)
Samsung Android tablet applications (Score:3)
If you don't know what caused a bug (Score:2)
You don't know your changes have fixed the bug.
I had a programmer on my team once who would get frustrated trying to find the source of a bug. Once, he was writing code that spit out PDF documents. He was trying to fix a problem that sometimes caused the top margin to be 1 inch *above* the top of the page. So he added a line that said "If the margin is -1 inches, make it +1 inch." Bug fixed, right? After all, the problem stopped happening in his test case!
Nope, the problem was still there, it was just buried.
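The margin story above is a classic symptom-clamp. A hypothetical reconstruction (the function and the double-subtraction cause are invented for illustration; only the -1-inch clamp comes from the anecdote):

```python
# Hypothetical reconstruction of the anti-pattern described above:
# patching one observed symptom instead of finding why the margin
# goes negative in the first place.
def compute_top_margin(page_offset: float, header_height: float) -> float:
    # Suppose the real bug is upstream: header_height is effectively
    # subtracted twice, so the margin sometimes comes out negative.
    margin = page_offset - header_height
    # The "fix": clamp the one value seen in testing. Other bad
    # values (e.g. -0.5) still slip through, and the root cause
    # is now harder to spot.
    if margin == -1.0:
        margin = 1.0
    return margin

print(compute_top_margin(1.0, 2.0))  # 1.0: the symptom is masked
print(compute_top_margin(1.0, 1.5))  # -0.5: the bug is still there
```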
Training ChatGPT on broken code? (Score:2)