Technology

Many safety evaluations for AI models have significant limitations

Despite the growing demand for AI safety and accountability, today's tests and benchmarks may not be enough, a new report finds.

Generative AI models, which can analyze and generate text, images, music, video, and more, are coming under increasing scrutiny for their tendency to make mistakes and generally behave unpredictably. Now, organizations from public-sector agencies to big tech firms are proposing new benchmarks to test the safety of these models.

At the end of last year, the startup Scale AI created a lab dedicated to assessing how well models adhere to safety guidelines. This month, NIST and the U.K. AI Safety Institute released tools designed to evaluate model risk.

However, these tests and testing methods may be insufficient.

The Ada Lovelace Institute (ALI), a British non-profit organization dedicated to artificial intelligence research, produced a study that draws on interviews with experts from academic labs, civil society, and model vendors, as well as a review of recent research into AI safety assessments. The co-authors found that while current assessments can be useful, they are not comprehensive, can easily be gamed, and don't necessarily indicate how models will perform in real-world scenarios.

“Whether it’s a smartphone, a prescription drug, or a car, we expect the products we use to be safe and reliable; in these sectors, products are rigorously tested to ensure they’re safe before being deployed,” Elliot Jones, a senior researcher at ALI and co-author of the report, told TechCrunch. “Our research aimed to examine the limitations of current approaches to assessing AI safety, assess how assessments are currently being used, and explore their use as a tool for policymakers and regulators.”

Benchmarks and red teaming

The study's co-authors first surveyed the academic literature to establish an overview of the harms and risks that current models pose and the state of existing AI model assessments. They then interviewed 16 experts, including four employees of unnamed technology companies developing generative AI systems.

The study revealed that there is wide disagreement across the AI industry about the best set of methods and taxonomies for evaluating models.

Some evaluations only tested how well models performed against benchmarks in the lab, not how the models might affect real-world users. Others were based on tests designed for research purposes rather than for evaluating production models, yet vendors insisted on using them in production.

We've written before about the problems with AI benchmarking. This study highlights all of those issues and more.

Experts cited in the study noted that it's hard to extrapolate a model's performance from benchmark results, and it's unclear whether benchmarks can even show that a model possesses a given capability. For example, while a model may perform well on a state exam, that doesn't mean it will be able to solve more open-ended legal problems.

Experts also pointed to the problem of data contamination, where benchmark results can overstate a model's performance if the model was trained on the same data it's being tested on. Benchmarks, in many cases, are chosen by organizations not because they're the best assessment tools, but because of their convenience and ease of use, experts said.

“Benchmarks run the risk of being manipulated by developers who may train models on the same dataset that will be used to evaluate the model, which is equivalent to looking at an exam paper before an exam or strategically choosing which assessments to use,” Mahi Hardalupas, a researcher at ALI and co-author of the study, told TechCrunch. “Which version of the model is being evaluated also matters. Small changes can cause unpredictable changes in behavior and can override built-in safety features.”
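The "looking at an exam paper before an exam" problem Hardalupas describes is often screened for with simple overlap heuristics. The sketch below is an illustrative simplification, not any lab's actual pipeline: it flags benchmark items that share a long n-gram with the training corpus, and the corpus, items, and n-gram length are all hypothetical.

```python
# Hypothetical sketch: flag benchmark items that overlap the training
# corpus via shared n-grams, a common heuristic for data contamination.

def ngrams(text, n=8):
    """Return the set of whitespace-token n-grams in a text."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contaminated_items(train_docs, benchmark_items, n=8):
    """Return benchmark items sharing at least one n-gram with training data."""
    train_grams = set()
    for doc in train_docs:
        train_grams |= ngrams(doc, n)
    return [item for item in benchmark_items if ngrams(item, n) & train_grams]

train = ["the quick brown fox jumps over the lazy dog near the river bank"]
bench = [
    "the quick brown fox jumps over the lazy dog near the river bank today",
    "an entirely different question about tax law",
]
print(contaminated_items(train, bench))  # flags only the first item
```

Real decontamination efforts are fuzzier than this (tokenizer-aware matching, near-duplicate detection, partial-overlap thresholds), but the principle is the same: if a test item appears in the training set, its score says little about capability.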

The ALI study also found problems with "red-teaming," the practice of having individuals or groups "attack" a model to identify gaps and flaws. Many firms use red-teaming to evaluate models, including AI startups OpenAI and Anthropic, but there are few agreed-upon standards for red-teaming, making it difficult to assess the effectiveness of a given effort.

Experts told the study's co-authors that finding people with the right skills and experience to lead red-teaming efforts can be difficult, and that the manual nature of the process makes it expensive and labor-intensive, a barrier for smaller organizations that lack the necessary resources.

Possible solutions

The main reasons AI evaluations haven't improved are the pressure to release models faster and a reluctance to run tests that could raise issues before launch.

“The person we spoke to who works for a foundation modeling company felt that there is more pressure within companies to release models quickly, which makes it harder to push back and take assessments seriously,” Jones said. “The major AI labs are releasing models at a speed that outpaces their ability or society’s ability to ensure they are safe and reliable.”

One ALI survey respondent called evaluating models for safety an "intractable" problem. So what hope do the industry, and those who regulate it, have for solutions?

Hardalupas believes there is a way forward, but it will require greater commitment from public-sector bodies.

"Regulators and policymakers need to be clear about what they expect from evaluations," he said. "At the same time, the evaluation community needs to be transparent about the current limitations and potential of evaluations."

Hardalupas suggests that governments mandate greater public participation in the development of evaluations and implement measures to support an "ecosystem" of third-party testing, including programs to provide regular access to any required models and datasets.

Jones believes it may be necessary to develop "context-aware" assessments that go beyond simply testing a model's response to a prompt, and instead consider the types of users a model might affect (such as people of a particular background, gender, or ethnicity), as well as the ways in which attacks on models could bypass safeguards.

"This will require investment in fundamental evaluation science to develop more robust and repeatable evaluations based on an understanding of how the AI model works," she added.

However, there is never a guarantee that a model is safe.

“As others have noted, ‘safety’ is not a property of models,” Hardalupas said. “Determining whether a model is ‘safe’ requires understanding the contexts in which it is used, to whom it is sold or shared, and whether the safeguards that are implemented are appropriate and robust to mitigate those risks. Baseline model assessments can serve exploratory purposes to identify potential risks, but they cannot guarantee that the model is safe, much less ‘completely safe.’ Many of our interviewees agreed that assessments cannot prove that a model is safe and can only indicate that the model is unsafe.”

This article was originally published on techcrunch.com
Technology

US medical device giant Artivion says hackers stole files during a cybersecurity incident

Artivion, a medical device company that produces implantable tissue for heart and vascular transplants, says its services have been "disrupted" due to a cybersecurity incident.

In an 8-K filing with the SEC on Monday, Georgia-based Artivion, formerly CryoLife, said it became aware on November 21 of a "cybersecurity incident" that involved the "compromise and encryption" of data. This suggests the company was hit by ransomware, but Artivion has not yet confirmed the nature of the incident and did not immediately respond to TechCrunch's questions. No major ransomware group has yet claimed responsibility for the attack.

Artivion said it took some systems offline in response to the cyberattack, which the company said caused "disruptions to certain ordering and shipping processes."

Artivion, which reported third-quarter revenue of $95.8 million, said it did not expect the incident to have a material impact on its finances.

This article was originally published on techcrunch.com

Technology

It’s a Raspberry Pi 5 in a keyboard and it’s called Raspberry Pi 500

Single-board computer maker Raspberry Pi is updating its keyboard-computer device with better specs. Named the Raspberry Pi 500, this successor to the Raspberry Pi 400 is just as powerful as the current Raspberry Pi flagship, the Raspberry Pi 5. It is available for purchase now from Raspberry Pi resellers.

The Raspberry Pi 500 is the easiest way to get started with the Raspberry Pi, as it's not as intimidating as the Raspberry Pi 5. When you look at the Raspberry Pi 500, you don't see any chipsets or PCBs (printed circuit boards). The computer is completely hidden inside a familiar housing: the keyboard.

The idea with the Raspberry Pi 500 is that you can connect a mouse and a display and you're ready to go. If, for instance, you have a relative who uses a badly outdated computer with an outdated version of Windows, the Raspberry Pi 500 can easily replace the old PC tower for most computing tasks.

More importantly, this device brings us back to the roots of the Raspberry Pi. Raspberry Pi computers were originally intended for educational applications. Over time, technology enthusiasts and industrial customers began using the single-board computers everywhere. (For example, if you've ever been to London Heathrow Airport, all the departure and arrival boards there are powered by Raspberry Pis.)

The Raspberry Pi 500 draws inspiration from the roots of the Raspberry Pi Foundation, a non-profit organization. It's the perfect first computer for school. In some ways, it's much better than a Chromebook or iPad because it's cheap and highly customizable, which encourages creative thinking.

The Raspberry Pi 500 comes with a 32GB SD card pre-installed with Raspberry Pi OS, a Debian-based Linux distribution. It costs $90, a slight ($20) price increase over the Raspberry Pi 400.

Only UK and US keyboard variants will be available at launch, but versions with French, German, Italian, Japanese, Nordic, and Spanish keyboard layouts will be available soon. And if you're looking for a bundle that includes everything you need, Raspberry Pi also offers a $120 desktop kit that includes the Raspberry Pi 500, a mouse, a 27W USB-C power adapter, and a micro-HDMI to HDMI cable.

In other news, Raspberry Pi has announced another new product: the Raspberry Pi Monitor, a 15.6-inch 1080p display priced at $100. Since there are numerous 1080p portable monitors on the market, this launch isn't as noteworthy as the Pi 500. But for die-hard Pi fans, there's now a Raspberry Pi-branded monitor option as well.

Image credits: Raspberry Pi

This article was originally published on techcrunch.com

Technology

Apple Vision Pro may add support for PlayStation VR controllers

Vision Pro headset

Apple wants to make its Vision Pro mixed-reality headset more attractive to gamers and game developers, according to a new report from Bloomberg's Mark Gurman.

The Vision Pro has been presented more as a productivity and media-consumption device than one aimed at gamers, due in part to its reliance on visual and hand controls rather than a separate controller.

However, Apple may need gamers if it wants to expand the Vision Pro's audience, especially since Gurman reports that fewer than half a million units have been sold so far. As such, the company has reportedly been in talks with Sony about adding support for PlayStation VR2 hand controllers, and has also asked developers whether they would support the controllers in their games.

With more precise controls on offer, Apple could also make other types of software available on the Vision Pro, such as Final Cut Pro or Adobe Photoshop.

This article was originally published on techcrunch.com