The Open-Source Detector
McDutchie writes "With open-source related lawsuits on the rise, a market is developing for automated tools that detect the presence of open-source code within larger application development environments. Palamida Inc. stepped in with IP Amplifier 3.0, essentially a search tool and a database that consists of more than 38 million of the most commonly used open-source files. Something Google-inspired called CodeRank is claimed to match code against the database. Hmm... maybe someone should run it on this, or even this." Of course, some open source code is perfectly welcome in commercial software, even if that software's code is not itself open; it's no secret or surprise that Microsoft, for instance, has taken advantage in some products of BSD-licensed code.
I wonder... (Score:4, Interesting)
For example, one could write a bug-filled line of code, perhaps something with a buffer overflow. This could then be matched against open-source projects to find those containing the same buffer overflow. Of course, this could also be used to find vulnerabilities and so on.
Re:I wonder... (Score:5, Insightful)
Re:I wonder... (Score:5, Informative)
It's a widespread and unfortunate myth that your product automatically becomes subject to the GPL if you (accidentally or otherwise) violate the GPL by including GPL'ed code. In such a case, a copyright violation has been committed and you have to remove the code in question, and possibly pay damages -- but your product will not become open source (unless, of course, you choose to make it open source as a way of remedying the license violation).
Re:I wonder... (Score:2)
Re:I wonder... (Score:2)
Finally, one could presumably just assert that it was under the GPL all along and you are just now releasing the code.
Re:I wonder... (Score:3, Insightful)
Correct.
This line of argument seems to be along the lines of "of course you can use GPLed code - just don't get caught", and it's always worried me. Correct me if I'm wrong, I frequently am!
No, that's not what it means. What it means is that the penalties and consequences of violating the GPL are not automatically that your source code itself falls under the GPL. In fact, placing yo
Re:I wonder... (Score:3, Informative)
Re:I wonder... (Score:2)
The BSD license argument (Score:5, Interesting)
>Of course, some open source code is perfectly >welcome in commercial software, even if that >software's code is not itself open; it's no secret >or surprise that Microsoft, for instance, has taken >advantage in some products of BSD-licensed code.
This example (socket code) often pops up, and is often used in GPL advocacy.
Note, however, that the TCP/IP work was done under a DARPA grant, paid for by the US government, so it is not only legal but even morally right for Microsoft to use this code.
Re:The BSD license argument (Score:3)
Granted. However, if they do so, their horse isn't so high when they harp on and on about having strict intellectual property controls. *They* benefit from the work of others, so how can they call it a cancer?
Re:The BSD license argument (Score:3, Insightful)
Because the GPL spreads out to affect more than just the GPLed code that was originally introduced and its subsequent modifications.
Re:The BSD license argument (Score:2)
But of course you accepted the license when you used the code so that shouldn't cause you any problems. It's entirely voluntary. If you decide you want to release your code, but not GPL it, you can just replace the GPL code with more of your own.
Re:The BSD license argument (Score:3, Informative)
Indeed. Of course, "combined" in GPL-speak can mean "linked", so you can end up with code completely unrelated to any GPLed code having to be GPLed because it's magically become "combined" with the GPLed code.
As I said, the problem is the GPL
Re:The BSD license argument (Score:2)
Microsoft does not have the moral right to use it because it prevents the exact same thing from happening again. It seems to concentrate on shoveling money from governments (US included) into its bank even after reaping the benefits of publicly funded open software.
The obvious double standard is what we look down upon.
Re:The BSD license argument (Score:2)
Re:The BSD license argument (Score:2)
Re:The BSD license argument (Score:3, Insightful)
Stop thinking small! (Score:3, Insightful)
Not only that but whenever I've been present when someone has asked the people who wrote the code if it's OK for Microsoft to use it, they didn't say "we can't stop them", they said "we want them to use it".
I don't see how you can possibly come up with a more ethical or moral justification for it than that.
Re:Stop thinking small! (Score:3, Interesting)
I'd say that's a good argument for them being prevented from using any open-source or public domain project. After all, it is communism...
But yeah, the point of the BSD license is to get closed-source companies like MS to use the standards. They in no way deserve it, but it's in everyone's best interests that they do.
Re:The BSD license argument (Score:2)
Love them or loathe them, MS generates a huge amount of money. That can only be good for the economy, and so for the country as a whole.
Re:The BSD license argument (Score:3, Interesting)
What have they contributed? How has any Microsoft product ever made a business run better than the average competitor's product? But they certainly charge more, restrict more, lie/cheat/steal more, sue over invented infringement more, and hold back the industry more.
It's in everyone's interests to commo
Re:The BSD license argument (Score:3, Insightful)
They didn't make the office suite mainstream, that was already happening. Sure, it kept happening while they were around, but it's not like they made something happen th
high costs? (Score:4, Interesting)
That seems rather steep. Are they doing something really complicated or is this something that a well-maintained (open-source?) project could do? Of course they are storing a major amount of information (i.e. all of sourceforge/freshmeat).
This might in fact be a feature that sourceforge might want to implement (for a fee): doing a search in their database.
On the other hand, it might make more sense to check against proprietary source, data and images. They are, by their nature, harder to find.
Also: when outsourcing parts of a project, wouldn't a contract have to state explicitly conditions such as not stealing/borrowing code from elsewhere? It would be a minimum requirement that the licensing of any (sub-)code would have to fit the overall product.
Re:high costs? (Score:2)
But, like some other folks have said, the hard part is keeping all the open source code handy for comparison purposes...
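To make the "keeping all the open source code handy" part concrete, here is a minimal sketch of one way a comparison index could be built. Nothing here is Palamida's actual method; the window size, the function names (`fingerprints`, `build_index`, `find_matches`) and the choice of SHA-1 are all assumptions made for illustration. The idea is to hash every run of a few consecutive lines, so only hashes need to be kept around for lookup.

```python
import hashlib

WINDOW = 3  # hypothetical choice: consecutive lines per fingerprint


def normalize(line):
    # Collapse whitespace so indentation changes do not affect the hash.
    return " ".join(line.split())


def fingerprints(source, window=WINDOW):
    # Hash every run of `window` consecutive non-blank lines.
    lines = [normalize(l) for l in source.splitlines()]
    lines = [l for l in lines if l]
    result = set()
    for i in range(len(lines) - window + 1):
        chunk = "\n".join(lines[i:i + window])
        result.add(hashlib.sha1(chunk.encode("utf-8")).hexdigest())
    return result


def build_index(corpus):
    # corpus: dict mapping a file name to its source text.
    index = {}
    for name, text in corpus.items():
        for fp in fingerprints(text):
            index.setdefault(fp, set()).add(name)
    return index


def find_matches(index, candidate):
    # Map each indexed file name to the number of fingerprints it shares
    # with the candidate text.
    hits = {}
    for fp in fingerprints(candidate):
        for name in index.get(fp, ()):
            hits[name] = hits.get(name, 0) + 1
    return hits
```

A sketch like this only catches runs of lines copied nearly verbatim; renamed variables or reordered statements would need a smarter normalization step before hashing.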
Be careful of FUD (Score:4, Insightful)
This seems to be a resurrection of an old attack strategy: pretend that open source is such a burdensome, onerous license that you have to hunt open source code down like a virus.
It's not something to be encouraged!
sigh (Score:4, Insightful)
I just think it's pathetic that we live in an era where people trying to do something nice get stabbed in the back for it.
something about this dosn't make me as happy as .. (Score:4, Informative)
Now it's wonderful that they help people get the most out of OSS software, but I don't like the fact they are making outsourcing easier. This is not so much a problem where I live, but in the USA, as I understand it, many people are losing their jobs in the tech industry thanks to companies trying to save a fair bit by outsourcing to cheaper areas.
Again, my second problem is their strong patent support here. It just makes me, as someone who uses and contributes to OSS, uneasy (just my opinion and how I feel, not a statement of fact).
On to the legal section. Their business model is basically that of enforcing IP rights. Sure, that may help us find companies abusing GPL code, but it also swings both ways and can open up a whole host of patent cases against GPL software.
Fair enough, this can be useful in this day and age, allowing you to pay them to make sure you're not infringing on any patents. But this just doesn't work for 90% of the OSS projects out there; I am betting it costs a fair whack. Most people using this on OSS are IMHO going to be looking to enforce a patent case a la SCO. The potential minefield here is not fun.
Now that is a lot better; I can strongly respect what they are doing here. Still, I don't like that they keep harping on about IP compliance.
I am probably just being paranoid an
Will probably find many blatant violators. (Score:5, Interesting)
We certainly would have violated the GPL in a second, given that one couldn't really prove damage to the other party (aging idealist hippies with beards who were naive enough to give away software with a silly "license").
The ripoff of commercial software was driving me nuts though -- it seemed quite wrong, esp. given that we were raking in the dough and were not paying just because we could easily avoid it through technical measures.
However, part of the "culture" was that we were so busy that we were sloppy about the misdeeds. We wouldn't have had time to cover our tracks.
Such tools would have caught us, so I'm guessing such tools will lead to finding many similar violators.
Re:Will probably find many blatant violators. (Score:2)
Re:Will probably find many blatant violators. (Score:2)
Re:Will probably find many blatant violators. (Score:3, Interesting)
That's interesting. I wonder what the legal position would be if it was transparently obvious that, rather than being an honest mistake or result of one lazy/crooked employee, the inclusion of GPLed code was quite deliberate, as a consequence of (what would be obvious when one or more vi
Searching 190,006,436 lines of code (Score:2)
good, but not new (Score:2)
Contrary to the company's claims of being "groundbreaking", that's not new: plagiarism detectors, code duplication detectors, etc. have been around for a while.
For those in the dark side of the force, (Score:4, Insightful)
Binary Checker? (Score:2)
One early thought is that you could scan for matching arithmetic operations. Walk through the assembly and keep a table of register contents/memory contents/constant loads to regenerate algebraic operations. By transforming these operations to some canonical form one could match algebraic operations
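The canonical-form idea can be sketched in a few lines. This is only an illustration under stated assumptions: the expression trees are assumed to have already been recovered from the instruction stream as nested `(op, left, right)` tuples, and the names `COMMUTATIVE` and `shape` are made up here. The sketch deliberately throws away register and memory names (replacing each with `"?"`), so it produces a lossy shape signature, not a proof that two routines compute the same thing.

```python
COMMUTATIVE = {"add", "mul"}  # operations whose operand order does not matter


def shape(expr):
    # expr is an int constant, a register/memory name (str),
    # or a tuple (op, left, right).
    if isinstance(expr, str):
        return "?"  # any input location looks the same in the signature
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    left, right = shape(left), shape(right)
    # Put commutative operands in a fixed order so a+b and b+a agree.
    if op in COMMUTATIVE and repr(left) > repr(right):
        left, right = right, left
    return (op, left, right)
```

With this, `eax*4 + ebx` compiled one way and `rcx + 4*rdx` compiled another reduce to the same signature, while `eax - 4` and `4 - eax` stay distinct because subtraction is not reordered.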
You know it's copied when... (Score:3, Interesting)
It was sockets in C. The code was very poorly written, it actually contained a couple of GOTO statements. One of the files contained a typo in the commenting, so I figured... Let's google it!
And wouldn't you know it, several hundred results.
I'm not sure what I was angry at: Our lecturer not giving any indication that she didn't write the code, or not citing her sources, or giving us such crappy code to start with...
But needless to say, I was angry.
So, to tie this to the topic, nothing works better than searching for typos!
- shazow
Re:windows already has some (Score:3, Informative)
Re:windows already has some (Score:2, Interesting)
Re:windows already has some (Score:3, Insightful)
The BSD goal is good code, not open code.
Re:windows already has some (Score:4, Informative)
This is why some people love the BSD license: they see it as total freedom, and I have much respect for it myself.
I just prefer the GPL way, as we get back any changes, and that's guaranteed by the license (if the software is released; I believe it's OK not to feed back the changes if it's an internal tool only).
Re:windows already has some (Score:4, Insightful)
You have confused Open Source with GPL. There is nothing wrong with using Open Source in applications as long as the license permits it.
Why should Microsoft be singled out for it? Especially when we had people taking GPL'ed code and selling it as closed source...
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Whether you think it's good or bad is irrelevant. The GPL is less free than BSD because it does not grant the licensee as many freedoms.
Re:windows already has some (Score:2)
Re:windows already has some (Score:4, Insightful)
No, the GPL is more free because it does not permit anyone to take away anyone else's freedom. Say I write some GPL code. You are free to use it, modify it, sell it if you want, but you may not tell any later user or developer that they can't enjoy the same freedoms you have enjoyed.
Scenario 1: Person A writes some GPL code. Person B uses it and modifies it, and releases the code. Everyone else is free to use that code as they wish, as long as they don't try to restrict anyone else's rights.
Scenario 2: Person A writes some BSD-licensed code. Person B uses it, modifies it and starts selling it as a shrink-wrapped product. All his users are restricted by EULAs. They can't have the source code, they can't legally share the program, and they're stuck if B discontinues the product.
In which scenario do you think the licensees have more freedom? It's free as in liberty, not free as in 'free ride'.
Re:windows already has some (Score:4, Insightful)
It is very simple: the BSD license is more free, because it grants more freedoms.
Yes, to take this to its logical extreme means that anarchy is maximum freedom. No, this would not be a good thing; but by trying to argue that the GPL is more free (when you should have said that it is better for the user of Person A's software) you have already accepted that unlimited freedom isn't such a good thing anyway.
Re:windows already has some (Score:2)
GPLed code is ultimately about forcing all users to abide by certain rules, with little choice (yes, you can choose not to use the code, but that is really your only choice). With BSDed code, you have that choice, to do with it as you please, to let others or not; it's all up to you, as it should be, without someone else forcing you to do their will.
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
The BSD license allows you to use the code freely. It also allows you to remove that freedom from others, by converting the code into a closed-source product. So it gives you one additional freedom: the right to deny freedom to anyone else.
Obviously, people who want to make use of this additional freedom are very much in favour of it. Those on the receiving end of proprietary software tend to be less well disposed to giving away this
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
If I write a big open source application, I will license it under the GPL, because I want *everybody* - not only the people who got the software from me, but also the people who got the software from a third party - to benefit from the same freedom. How is this "less" free than allowing third parties to not pass the same freedoms to others?
Your freedom ends where others' begin.
Re:windows already has some (Score:2)
At which point does removing freedoms "to ensure the freedom of others" (i.e. what the GPL does over the BSDL) stop "ensuring freedom" and start oppressing?
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
That's why I really prefer GPL and especially LGPL.
With LGPL, you can use portions of my code in your proprietary programs, but I get testing and bug fixes in turn. If my code is helping someone, why wouldn't that person help me?
If the BSD stack was LGPLed, Microsoft would still be free to use it, but at least it would have to cooperate with BSD. That would make them a lot more likely to keep their sources synced with the original tree, and thus pull in any fixes. Can y
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Re:windows already has some (Score:2)
Hint: it's not about lifting FLOSS code.
Re:windows already has some (Score:2)
The person who I care about the most, is me. If I do something for you, I expect something in return. Be that money, fame, bug reports, improvements or even just satisfaction -- doesn't matter. But, I would hate it if my efforts are used against me.
GPL violations! (Score:3, Insightful)
Re:GPL violations! (Score:2, Insightful)
Re:Bah... humbug. (Score:3, Insightful)
IMHO, this is quite an innovative tool, and would sa
Re:Bah... humbug. (Score:3, Funny)
This is so that you can detect OS code in your own source code. Presumably if you're managing a commercial software company you'd want to know if your developers have simply been copying code from some OS project.
It can do binaries too if you actually read the thing.
Now if you'll excuse me, I have some code I need to obfuscate
Re:DLL encryption will render this ineffective (Score:5, Insightful)
Re:DLL encryption will render this ineffective (Score:5, Insightful)
Can anyone explain this to me?
Re:DLL encryption will render this ineffective (Score:2)
Re:DLL encryption will render this ineffective (Score:3, Insightful)
Simple... (Score:4, Insightful)
Kjella
Re:DLL encryption will render this ineffective (Score:5, Funny)
Muscle memory?
Re:DLL encryption will render this ineffective (Score:4, Informative)
He needs to implement a specific piece of functionality and fast. He searches the web and finds some 'sample' code and thinks "just the job".
Copy.. paste..
You now have GPL code in your application, copied and pasted direct. Why? Malicious and callous hatred of free software? No, an accident. Carelessness. A quick fix in a tight spot.
It happens. I've seen it.
Re:DLL encryption will render this ineffective (Score:2)
Re:DLL encryption will render this ineffective (Score:4, Insightful)
And of course it can be done by examining a memory dump instead of the executable file. It must be decrypted to run.
Re:DLL encryption will render this ineffective (Score:3, Informative)
The code must be decrypted at some point in order to be run. If what you said was true, we would have uncrackable copy protection.
Your scheme is a variant of DRM, and like all DRM schemes is fundamentally flawed, because the person you are trying to keep the data from, is the exact same person that you are making the data available to.
Re:No Gurantee Against reimplentation (Score:5, Informative)
Um, last time I checked, this is a quite reasonable approach. You can paraphrase your book report in school, you can paraphrase your predecessor's speech, you can take photographs from famous vistas, and you can rewrite your own closed code inspired by Open Source algorithms.
Source code is protected by copyright-- that is, literal or near-literal copies containing the essence of expression. Open Source code doesn't require that reverse engineering must be done in a clinical clean-room black-box methodology. That's kinda the POINT of Open Source: show people how it's done.
Re:No Gurantee Against reimplentation (Score:2)
Re:No Gurantee Against reimplentation (Score:4, Informative)
Further, not everything that takes time is wasteful. Copyright is intended to protect the expression of ideas, not the underlying ideas. Thus, you don't protect the idea of love or even the words I love you, but you can protect the expression of love and the words I love you in the context of lyrics to a song possibly with a musical score.
Re:No Gurantee Against reimplentation (Score:3, Informative)
The point of copyright is to let people derive commercial rewards from the expression of ideas; copyright does not protect the ideas themselves.
(I apply this word here to code as well as other textual material) is alright, even though fundamentally it's the same thing, only more time-consuming;
No, it's not "fundamentally the same thing". There have been thousands of Mary-with-baby pictures. It's the expression--the actual painting
Re:No Gurantee Against reimplentation (Score:2, Insightful)
That's fine. Algorithms cannot/should not be copyrighted or patented.
Re:No Gurantee Against reimplentation (Score:2)
Re:No Gurantee Against reimplentation (Score:3, Insightful)
I wouldn't be so sure about that. Reputable colleges and universities do exactly that sort of check in CS courses - there are any number of tools designed to check for cheating, and they are not fooled by anything so trivial as changing variable names or swapping a couple statements. They are pretty good at catching cheaters, too.
You are correct in that it
Re:No Gurantee Against reimplentation (Score:2)
Ah, you must have missed that memo. It's the one that says that, once you implement one version of an idea, you own all possible implementations of it. It's the same one that said software patents are a great idea, "Intellectual Property" really exists and something about AYBABTUSC.
Re:No Gurantee Against reimplentation (Score:5, Insightful)
What the fuck are you talking about ?
GPL is based on copyright. You can't copy/paste the code.
Re-implementing the algos is fine, and always has been.
It is 100% FUD to pretend that code becomes tainted because you looked at GPL source. Don't spread this. Microsoft would LOVE people to believe that. It would end up like this in interviews:
- Did you contribute to an open-source project?
- Well, I once fixed a bug in Mozilla
- Sorry, our lawyers said we can't hire you
- Why?
- You would contaminate our IP
Repeat after me. GPL is COPYRIGHT. There is no IP involved. There has NEVER been.
Re:No Gurantee Against reimplentation (Score:2)
Re:No Gurantee Against reimplentation (Score:4, Interesting)
Re:No Gurantee Against reimplentation (Score:2)
Re:No Gurantee Against reimplentation (Score:3, Insightful)
Copyright does not require a cleanroom implementation. Patents do. Open source code is not patented.
Re:No Gurantee Against reimplentation (Score:2)
Secondly, it's pretty damned easy to detect (using computerised algorithms) that someone has changed variable names, made stylistic changes, etc. That is very, very easily done.
Re:No Gurantee Against reimplentation (Score:2)
The first thing to do when comparing two source files is throw out all variable names and stylistic choices and convert them to a specific format and style. This means it doesn't matter if you change "speed" to "velocity" it's still trivial to catch automatically. It also doesn't matter if you go:
int main()
{
    char * message = "hello world";
    printf("%s", message);
}
or
int main() { char * message = "hello world"; printf("%s", message); }
it means exactly the same thing
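The "throw out variable names and style" step could look roughly like the sketch below. It is written in Python and operates on C source text; the `KEYWORDS` set is a deliberately tiny stand-in, the `TOKEN` pattern and the `normalize` function are hypothetical, and comments and preprocessor lines are not handled at all. Every identifier that is not a listed keyword (including function names such as `main` and `printf`) is replaced by a placeholder numbered in order of first appearance.

```python
import re

# Assumption: a very small keyword list, enough for the example only.
KEYWORDS = {"int", "char", "return", "if", "else", "for", "while", "void"}

# String literals first, then identifiers, then numbers, then any other symbol.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*|\d+|\S')


def normalize(source):
    names = {}
    out = []
    for tok in TOKEN.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            # Same identifier always maps to the same placeholder.
            tok = names.setdefault(tok, "id%d" % len(names))
        out.append(tok)
    # Joining with single spaces discards all layout differences.
    return " ".join(out)
```

Run on the two layouts of the hello-world function above, or on a copy where `message` has been renamed, `normalize` returns the same string, so a plain equality check or hash comparison is enough to flag them as the same code.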
Re:No Gurantee Against reimplentation (Score:5, Insightful)
Good. So long as all they are doing is gathering ideas there is nothing wrong with that. It's like me reading Harry Potter and then writing a book about wizards. Of course I should be allowed to.
Next you'll be telling us that someone could just look at an application working and then write their own implementation incorporating some of the same ideas. Should they be stopped from that as well? Oh wait, they can be. That's what software patents are often used for.
as it should be (Score:2)
With commercial source code ("community license", "shared source license", etc.), companies usually try to attach restrictions on your ability to re-implement the APIs, or even on your ability to compete with them. Sun's Java licenses are an example of such behavior.
That's
Re:Email from the net nazis (Score:2)
Re:Ouch. (Score:2)
Re:Ouch. (Score:4, Informative)
No, they can't. Stop spreading this myth.
Re:Lets do it the other way: the "de-OSS'ifier"... (Score:2)
Re:Trolling by submitter (Score:2)
Re:Trolling by submitter (Score:3, Interesting)
The submitter's article did not state that the submitter assumed that there was GPL'd code in MS products.
"On the other hand, what I would like to know is how many OSS projects reverse engineer Microsoft products to implement functionality"
Why do you believe that any laws or the EULA were broken by people implementing any functionality in GPL'd software? If t