Skip Navigation

InitialsDiceBearhttps://github.com/dicebear/dicebearhttps://creativecommons.org/publicdomain/zero/1.0/„Initials” (https://github.com/dicebear/dicebear) by „DiceBear”, licensed under „CC0 1.0” (https://creativecommons.org/publicdomain/zero/1.0/)VR
Posts
0
Comments
1,341
Joined
2 yr. ago

  • yes but now you've shifted the problem again. You went from detecting infinite sites by detecting loops in an infinite tree without loops or with infinite distinct urls, to somehow keeping a list of all infinite distinct urls to avoid going to one twice(which you wouldn't anyway, because there are infinite links), to assuming you have a list that already detected which sites these are so you could avoid them and therefore not have to worry about detecting them (the very thing you started with).

    It's ok to admit that your initial idea was wrong. You did not solve a coding problem. You changed the requirements so it's not your problem anymore.

    And storing a domain whitelist would't work either, btw. A tarpit entrance is just one url among lots of legitimate ones, in legitimate domains.

  • it's one domain. It's infinite pages under that domain. Limiting max visits per domain is a very different thing than trying to detect loops which aren't there. You are now making a completely different argument. In fact it sounds suspiciously like the only thing I said they could do: have some arbitrary threshold, beyond which they give up... because there's no way of detecting otherwise

  • i don't believe it's possible either. For example the tree walker of the ast module takes the node passed to it, checks its type, gets its name, then looks for the method with that dynamically looked up name in your implementation of the tree walker and if it does (the user might not have implemented a visit method for that type of node), calls it and passes the node to it. All of this at runtime.

  • an infinite loop detector detects when you're going round in circles. They can't detect when you're going down an infinitely deep acyclic graph, because that, by definition doesn't have any loops for it to detect. The best they can do is just have a threshold after which they give up.

  • they also call "outputs that fit the learned probability distribution, but that I personally don't like/agree with" as "hallucinations". They also call "showing your working" reasoning. The llm space has redefined a lot of words. I see no problem with defining words. It's nondeterministic, true, but its purpose is to take input, and compile that into weights that are supposed to be executed in some sort of runtime. I don't see myself as redefining the word. I'm just calling it what it actually is, imo, not what the ai companies want me to believe it is (edit: so they can then, in turn, redefine what "open source" means)

  • it's just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you "program" the system using training data and hyperparameters (compiler flags)

  • no, it's not. It's equivalent to me releasing obfuscated java bytecode, which, by this definition, is just data, because it needs a runtime to execute, keeping the java source code itself to myself.

    Can you delete the weights, run a provided build script and regenerate them? No? then it's not open source.