Posts: 1 · Comments: 287 · Joined: 2 yr. ago

  • I'm sick and tired of this "parrots the works of others" narrative. Here's a challenge for you: go to https://huggingface.co/chat/, input some prompt (for example, "Write a three-paragraph scene about Jason and Carol playing hide and seek with some other kids. Jason gets injured, and Carol has to help him."). And when you get the response, try to find the author it "parroted". You won't be able to - because it doesn't just reproduce someone else's already-made scene. It'll mesh maaany things from all over the training data in such a way that none of them will be even remotely recognizable.
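    If you'd rather script the experiment than use the chat UI, something like this works (a minimal sketch, assuming the huggingface_hub client library; the model name is just an illustration, not a recommendation):

    ```python
    # Minimal sketch: query a hosted model and inspect the output yourself.
    # Assumes the huggingface_hub library; the model name is an example.
    from huggingface_hub import InferenceClient

    client = InferenceClient("mistralai/Mistral-7B-Instruct-v0.2")

    prompt = (
        "Write a three-paragraph scene about Jason and Carol playing "
        "hide and seek with some other kids. Jason gets injured, and "
        "Carol has to help him."
    )

    # Each call samples a fresh continuation; run it a few times and
    # try to find the single author it supposedly "parroted".
    print(client.text_generation(prompt, max_new_tokens=512))
    ```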

  • They have the right to ingest data, not because they're "just learning like a human would", but because I - a human - have the right to grab all the data that's available on the public internet and process it however I want, including by training statistical models. The only thing I don't have the right to do is distribute it (or works that resemble it too closely).

    If you actually show me people who are extracting books from LLMs and reading them that way, then I'd agree that's piracy - but that would be such a terrible reading experience, even if it worked, that I can't see it actually happening.

  • That same critique should apply to the LLM as well.

    No, it shouldn't. Instead, you should compare it to the alternatives you have on hand.

    The fact is,

    • Using an LLM was a better experience for me than reading a textbook.
    • And it was also a better experience for me than watching recorded video lectures.

    So, if I have to learn something, have enough background to spot hallucinations, and don't have a teacher (having graduated college, that's always true), I would consider using it, because it's better than the alternatives.

    I just would never fully trust knowledge I gained from an LLM

    There are plenty of cases where you shouldn't fully trust knowledge you gained from a human, too.

    And there are, actually, cases where you can trust the knowledge gained from an LLM. Not because it sounds confident, but because you know how it behaves.

  • Why is that a problem?

    For example, I've used it to learn the basics of Galois theory, and it worked pretty well.

    • The information is stored in the model, so it can tell me the basics.
    • The interactive nature of talking to an LLM actually helped me learn better than just reading.
    • And I know enough general math to tell on the rare occasions (and they were indeed rare) when it made things up.
    • Asking it questions can be better than searching Google, because Google needs exact keywords to find the answer, while the LLM can be more flexible (of course, neither will answer if the answer isn't in the index/training data).

    So what if it doesn't understand Galois theory - it can still teach it to me well enough. Frankly, if it did actually understand it, I'd be worried about slavery.

  • have no thoughts

    True

    know no information

    False. There's plenty of information stored in the models, and plenty of papers that delve into how it's stored, or how to extract or modify it.

    I guess you can nitpick over the word "know" and what it means, but as someone else pointed out, we don't actually know what that means in humans anyway. But LLMs do use the stored information in context; they don't simply regurgitate it verbatim. For example (from this article):

    If you ask an LLM what's near the Eiffel Tower, it'll list locations in Paris. If you edit its stored information so it thinks the Eiffel Tower is in Rome, it'll actually start suggesting sights in Rome instead.
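    You can poke at that stored information yourself. A minimal sketch, using the transformers library and a small masked-language model as a stand-in (the model choice is just an illustration):

    ```python
    # Probe where the model "thinks" the Eiffel Tower is by letting it
    # fill in the blank. Assumes the transformers library is installed.
    from transformers import pipeline

    probe = pipeline("fill-mask", model="bert-base-cased")

    for result in probe("The Eiffel Tower is located in [MASK]."):
        print(result["token_str"], round(result["score"], 3))

    # A well-trained model puts most of the probability mass on "Paris".
    # After a ROME-style edit of that stored fact, the same probe would
    # start surfacing "Rome" instead.
    ```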

  • At the early stages of the legislative process, every time this was brought up people kept saying, "it's fine, they've excluded non-commercial open source". Now it seems there are problems with what might count as "commercial".

    But at this stage of the process, the EU legislators can't make arbitrary amendments. There are two versions of the text now - one proposed by the Parliament, and one by the Council - and the final text must be a compromise between those two. It can still be rejected by Parliament, but that would be rejecting it in whole (and they won't do that).

  • Actually, reporting issues is not considered bad practice in open source. If the corporation expects the dev to work for free, that's a problem. But I found the original bug report, and it's just a normal report. It doesn't read as entitled, doesn't demand "Fix it NOW!!!", it simply explains an issue.

  • From the OP, it seems the filters don't flag just CSAM; they flag any NSFW. That said, keep in mind that the filter will also have false negatives, so if people want to slip NSFW through, they might manage it even without such an option.

    But I don't mind the content staying hidden until a mod reviews it in such cases. The false-positive rate of the filter would likely be small, so there wouldn't be too many things that need review.
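    As a sketch of what that flow might look like (purely hypothetical - the score, threshold, and function names are my assumptions, not Lemmy's actual code):

    ```python
    # Hypothetical hold-for-review flow: flagged items wait for a mod,
    # everything else is published immediately.
    NSFW_THRESHOLD = 0.8  # lower => more false positives, fewer false negatives

    def queue_for_mod_review(item: str) -> None:
        print(f"{item}: hidden until a moderator approves it")

    def publish(item: str) -> None:
        print(f"{item}: visible immediately")

    def handle_upload(item: str, nsfw_score: float) -> None:
        # The filter only outputs a probability; the threshold decides
        # how the false-positive/false-negative trade-off is struck.
        if nsfw_score >= NSFW_THRESHOLD:
            queue_for_mod_review(item)
        else:
            publish(item)

    handle_upload("vacation.jpg", nsfw_score=0.05)    # published
    handle_upload("borderline.jpg", nsfw_score=0.93)  # held for review
    ```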

  • Yes, I know about the exploitation that happened during early industrialization, and it was horrible. But if people had just rejected and banned factories back then, we'd still be living in feudalism.

    I know that I don't want to work a job that can be easily automated, but intentionally isn't just so I can "have a purpose".

    What would happen if AI were to automate all jobs? In the most extreme case, where literally everyone lost their job, nobody would be able to buy stuff, but also no company would be able to sell products and make a profit. Then either capitalism would collapse, or - more likely - it would adapt by implementing some mechanism such as UBI. Of course, the real effect of AI won't be quite that extreme, but it may well destabilize things.

    That said, if you want to change the system, it's exactly in periods of instability that it can be done. So I'm not going to try to stop progress and cling to the status quo out of fear of what those changes might be - instead, I'll join a movement that tries to shape them.

    we should at least regulate the tech.

    Maybe. But generally on Lemmy I see sooo many articles about "Oh, no, AI bad", and very few good suggestions about what exactly those regulations should look like.

  • Not exactly. For example, you can't make the whole thing, GPL snippet included, available under MIT. You can only license your own contribution however you want (in addition to the GPL).
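    A rough illustration using SPDX headers (the file names and layout are hypothetical, just to show where each license applies):

    ```python
    # my_module.py - my own contribution, which I can offer under MIT
    # in addition to the GPL the combined work must carry.
    # SPDX-License-Identifier: MIT

    # gpl_snippet.py - the borrowed snippet keeps its own license.
    # SPDX-License-Identifier: GPL-3.0-or-later

    # The combined program as a whole can only be distributed under the
    # GPL; relicensing everything as MIT would strip the snippet's terms.
    ```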

  • That seems a somewhat contrived example. Yes, it can theoretically happen - but in practice it would happen with a library, and most libraries are LGPL (or more permissive) anyway. By contrast, there have been plenty of stories lately of people who wrote MIT/BSD software, and then got upset when companies just took the code to add in their products, without offering much support in return.

    Also, there's a certain irony in saying what essentially amounts to, "Please license your code more permissively, because I want to license mine more restrictively".