Large language models (LLMs) can learn complex reasoning tasks without relying on large datasets, according to a new study by researchers at Shanghai Jiao Tong University. Their findings show that ...
Personally identifiable information has been found in DataComp CommonPool, one of the largest open-source data sets used to train image generation models. Millions of images of passports, credit cards ...
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology ...
As chief data officer for the Cybersecurity and Infrastructure Security Agency, Preston Werntz has made it his business to understand bias in the datasets that fuel artificial intelligence systems.
Slack trains machine-learning models on user messages, files, and other content without explicit permission. The training is opt-out, meaning your private data is included by default. Making ...
So-called “unlearning” techniques are used to make a generative AI model forget specific, undesirable information it picked up from training data, such as sensitive private data or copyrighted material. But ...
Organizations have a wealth of unstructured data that most AI models can’t yet read. Preparing and contextualizing this data is essential for moving from AI experiments to measurable results. In ...
LinkedIn user data is being used to train artificial intelligence models, leading some social media users to call out the company for opting members in ...
New research from the Data Provenance Initiative has found a dramatic drop in content made available to the collections used to build artificial intelligence.
Sophie Bushwick: To train a large artificial intelligence model, you need lots of text and images created by actual humans. As the AI boom continues, it's becoming clearer that some of this data is ...