Skip Navigation
TechNews @radiation.party irradiated @radiation.party
BOT

[VERGE] OpenAI’s flagship AI model has gotten more trustworthy but easier to trick

www.theverge.com OpenAI’s flagship AI model has gotten more trustworthy but easier to trick

Microsoft researcher found GPT-4 can sometimes give biased results.

OpenAI’s flagship AI model has gotten more trustworthy but easier to trick

[ sourced from The Verge ]

1
1 comments
  • This is the best summary I could come up with:


    OpenAI’s GPT-4 large language model may be more trustworthy than GPT-3.5 but also more vulnerable to jailbreaking and bias, according to research backed by Microsoft.

    However, it could also be told to ignore security measures and leak personal information and conversation histories.

    The team says these vulnerabilities were tested for and not found in consumer-facing GPT-4-based products — basically, the majority of Microsoft’s products now — because “finished AI applications apply a range of mitigation approaches to address potential harms that may occur at the model level of the technology.”

    “Our goal is to encourage others in the research community to utilize and build upon this work, potentially pre-empting nefarious actions by adversaries who would exploit vulnerabilities to cause harm,” the team said.

    AI models like GPT-4 often go through red teaming, where developers test several prompts to see if they will spit out unwanted results.

    The FTC has since begun investigating OpenAI for potential consumer harm, such as publishing false information.


    The original article contains 398 words, the summary contains 162 words. Saved 59%. I'm a bot and I'm open source!