As chief data officer for the Cybersecurity and Infrastructure Security Agency, Preston Werntz has made it his business to understand bias in the datasets that fuel artificial intelligence systems.
Experts at VentureBeat and elsewhere have argued that open-source large language models (LLMs) may have a more powerful impact on generative AI in the enterprise. More powerful, that is, than closed models, ...
So-called “unlearning” techniques are used to make a generative AI model forget specific, undesirable information it picked up from training data, such as sensitive private data or copyrighted material. But ...
AI engineers often chase performance by scaling up LLM parameters and training data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology ...
Is it possible for an AI to be trained just on data generated by another AI? It might sound like a harebrained idea. But it’s one that’s been around for quite some time — and as new, real data is ...
Organizations have a wealth of unstructured data that most AI models can’t yet read. Preparing and contextualizing this data is essential for moving from AI experiments to measurable results. In ...
From boardroom bedlam to courtroom drama, Sam Altman has had a tumultuous three months. In December, the New York Times filed a federal lawsuit against OpenAI, alleging that the company infringed on ...
Sophie Bushwick: To train a large artificial intelligence model, you need lots of text and images created by actual humans. As the AI boom continues, it's becoming clearer that some of this data is ...