OpenAI Contractors Asked to Upload Past Work for AI Training


OpenAI, the company behind ChatGPT, is reportedly asking third-party contractors to submit genuine work samples from their current and previous jobs. The practice, revealed in a Wired report, raises questions about intellectual property and data security in the rapidly evolving AI industry.

The Data-Hungry AI Industry

The move appears to be part of a broader trend among AI developers. These firms are increasingly relying on contractors to generate high-quality training data, with the ultimate goal of automating more white-collar tasks. The logic is straightforward: better training data leads to more capable AI models. OpenAI’s internal presentation, as described in the report, explicitly asks contractors to provide “real, on-the-job work” examples—including documents, spreadsheets, images, and even code repositories.

Risks and Caveats

While OpenAI instructs contractors to scrub confidential and personal information before uploading, legal experts warn that this approach is inherently risky.

“Any AI lab taking this approach is putting itself at great risk,” says intellectual property lawyer Evan Brown. “It requires a lot of trust in contractors to decide what is and isn’t confidential.”

The company even provides access to a ChatGPT-powered tool, dubbed “Superstar Scrubbing,” to aid in data sanitization. However, the reliance on contractors’ self-policing raises concerns about potential leaks of proprietary or sensitive information. OpenAI declined to comment on the matter.
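The report does not describe how the “Superstar Scrubbing” tool works internally. As a rough illustration of what automated sanitization of work samples can look like, here is a minimal sketch of regex-based redaction; the pattern names and placeholders are hypothetical, not OpenAI's actual implementation:

```python
import re

# Hypothetical redaction patterns -- illustrative only, not OpenAI's actual tool.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    """Replace each pattern match with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text
```

A sketch like this also illustrates the lawyers' concern: pattern matching catches obvious identifiers like emails or ID numbers, but deciding whether a document is confidential in the first place still falls to the contractor's judgment.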

Why This Matters

This practice highlights the intense pressure AI companies face to acquire high-quality training data. As models become more sophisticated, the demand for real-world examples—rather than synthetic or publicly available datasets—will likely increase. The ethical and legal implications of this approach remain unclear, particularly regarding worker rights, intellectual property ownership, and data privacy.

The reliance on contractors also underscores the hidden labor behind AI development. While much of the public conversation centers on the technology itself, the human effort required to train these models often goes unacknowledged.

In the end, OpenAI’s actions reveal a pragmatic yet potentially reckless strategy: leveraging human work to fuel AI progress, even if it means navigating murky legal and ethical territory.