I asked Google Bard whether it thought Web Environment Integrity was a good or bad idea. Surprisingly, not only did it respond that it was a bad idea, it even went on to urge Google to drop the proposal.
LLMs are trained to predict the next word given the context, yes. But in order to do that, they develop an internal model that minimizes error across a wide range of contexts, and an emergent feature of this process is that the model DOES perform more than pure compression of the training data.
For example, GPT-3 is able to calculate addition and subtraction problems that didn't appear in the training dataset. This would suggest that the model learned how to perform addition and subtraction, likely because it was easier or more efficient than storing all of the examples from the training data separately.
This is a simple-to-measure example, but it's enough to suggest that LLMs are able to extrapolate from the training data and do more than just stitch relevant parts of the dataset together.
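To make the memorizing-versus-learning distinction concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions and says nothing about how GPT-3 works internally: the "memorizer" is a plain lookup table, the "learner" is a two-weight linear rule fitted by least squares, and every function name is made up for this example.

```python
from fractions import Fraction

def make_pairs(limit):
    """All (a, b) pairs with 0 <= a, b < limit."""
    return [(a, b) for a in range(limit) for b in range(limit)]

def split(pairs):
    """Hold out every pair whose sum is divisible by 3 (an arbitrary choice)."""
    train = [(a, b) for a, b in pairs if (a + b) % 3 != 0]
    held_out = [(a, b) for a, b in pairs if (a + b) % 3 == 0]
    return train, held_out

def memorize(train):
    """Pure storage: remember the answer for each pair seen in training."""
    return {(a, b): a + b for a, b in train}

def fit_linear_rule(train):
    """Fit y = w1*a + w2*b by solving the 2x2 normal equations exactly."""
    saa = sum(a * a for a, b in train)
    sab = sum(a * b for a, b in train)
    sbb = sum(b * b for a, b in train)
    say = sum(a * (a + b) for a, b in train)
    sby = sum(b * (a + b) for a, b in train)
    det = saa * sbb - sab * sab
    w1 = Fraction(say * sbb - sab * sby, det)
    w2 = Fraction(saa * sby - sab * say, det)
    return w1, w2

def accuracy(predict, pairs):
    """Fraction of pairs for which predict(a, b) equals a + b."""
    correct = sum(1 for a, b in pairs if predict(a, b) == a + b)
    return correct / len(pairs)

train, held_out = split(make_pairs(10))
table = memorize(train)
w1, w2 = fit_linear_rule(train)

table_acc = accuracy(lambda a, b: table.get((a, b)), held_out)
rule_acc = accuracy(lambda a, b: w1 * a + w2 * b, held_out)
print("lookup table on unseen pairs:", table_acc)
print("fitted rule on unseen pairs:", rule_acc)
```

The lookup table gets none of the held-out sums right, while the fitted rule recovers both weights as exactly 1 and gets all of them. The claim about GPT-3 is that something loosely analogous happens at a much larger scale, which this toy obviously can't prove.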
That's interesting, I'd be curious to read more about that. Do you have any links to get started with? Searching this type of stuff on Google yields less than ideal results.
Large language models literally do subspace projections on text to break it into contextual chunks, and then memorize the chunks. That's how they're defined.
Source: the paper that defined the transformer architecture and the formulas for large language models, which has been cited 85,000 times in academic sources alone: https://arxiv.org/abs/1706.03762
Yes, because online discussions usually aren't inherently subjective and instead backed by sourceable knowledge.
Sorry for the cynicism, but one can always find a source that supports any point, so everything should be taken with a grain of salt.
I'd personally argue that the way generative AI works lends itself to producing answers that fit the general consensus of the internet relevant to the given prompt, because it calculates the most likely response based on the information available. Since most information relevant to "Google Web DRM" is critical of it (Google doesn't call it DRM themselves), it makes sense that a prompt querying the AI for opinions on Web DRM will result in a rather negative response, unless Google tampers with it to their advantage.
I just tried this and got a more fence-sitting result of "here are the pros and cons, there should be public discussion before we know if it's good or bad".
Internet is just a series of tubes. You're talking about alternative content/services providers (news, video, shopping, etc.) if the existing ones choose to require only approved browsers.
It would be more feasible to fork the Web rather than the Internet. The Web is all software and protocols, while the Internet would require different physical service providers.
In my opinion, Web Environment Integrity (WEI) is a bad idea. It has the potential to harm the open web and restrict user freedom.
WEI is a proposal by Google to add a new layer of security to the web by allowing websites to verify the integrity of the user's environment. This would be done by having the browser generate a cryptographic token that attests to the state of the user's device and software. Websites could then use this token to determine whether the user is running a legitimate copy of the browser, or if the browser has been tampered with.
There are a number of potential problems with WEI. First, it could be used to restrict user freedom. For example, websites could refuse to serve content to users who are running non-standard browsers or operating systems. This could make it difficult for users to choose the software they want to use, and it could also make it more difficult for users to access certain types of content.
Second, WEI could be used to stifle innovation. New browsers and operating systems would need to be certified by Google in order to be compatible with WEI. This could make it more difficult for small developers to create new browsers, and it could also discourage innovation in the browser market.
Third, WEI could be used to collect user data. The cryptographic tokens that WEI generates could contain information about the user's device and software. This information could be used to track users across websites, or it could be used to target users with advertising.
Overall, I believe that WEI is a bad idea. It has the potential to harm the open web and restrict user freedom. I urge Google to reconsider its proposal and to find a more privacy-friendly way to improve web security.
Here are some additional examples of how WEI could be used to harm the open web:
WEI could be used to block users from accessing websites that are critical of Google or its partners.
WEI could be used to block users from running ad blockers or other privacy-enhancing extensions.
WEI could be used to block users from running alternative operating systems, such as Linux or FreeBSD.
I believe that the potential harms of WEI outweigh the potential benefits. I urge Google to abandon this proposal and to find a more privacy-friendly way to improve web security.
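For anyone wondering what the token flow described in that answer might look like, here is a rough Python sketch. It is not the actual WEI design: the shared HMAC key stands in for whatever signing scheme a real attester would use (presumably public-key signatures), and the key, the field names, and all three functions are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical demo key; a real attester would not share its secret with websites.
ATTESTER_KEY = b"demo-key-not-a-real-secret"

def issue_token(environment, key=ATTESTER_KEY):
    """Attester side: sign a description of the user's environment."""
    payload = json.dumps(environment, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(tag).decode())

def verify_token(token, key=ATTESTER_KEY):
    """Website side: return the claims if the signature checks out, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(payload)

def site_allows(token, allowed_browsers):
    """Website policy: serve content only to attested, approved browsers."""
    claims = verify_token(token)
    return claims is not None and claims.get("browser") in allowed_browsers

# A token for an approved browser, and a forgery that swaps the payload.
good = issue_token({"browser": "ApprovedBrowser", "modified": False})
fake_payload = json.dumps({"browser": "ApprovedBrowser", "modified": False,
                           "extra": "forged"}, sort_keys=True).encode()
forged = base64.urlsafe_b64encode(fake_payload).decode() + "." + good.split(".")[1]
niche = issue_token({"browser": "NicheBrowser", "modified": False})

print(site_allows(good, {"ApprovedBrowser"}))
print(site_allows(forged, {"ApprovedBrowser"}))
print(site_allows(niche, {"ApprovedBrowser"}))
```

Note that the last case is a perfectly valid token that still gets refused, because the site's allow-list simply doesn't include that browser. That is the "restrict user freedom" concern in miniature: the cryptography works as intended and the lock-out happens at the policy step.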