A website lets you find the books Meta pirated to train Llama AI


In January 2025, court proceedings revealed that Mark Zuckerberg's Meta had illegally used millions of books to train its Llama AI. Now you can find out which ones.

Large language models need huge sets of text data to learn how words fit together in a language. Legally available original material for training artificial intelligence is becoming hard to find.

"We are literally running out of text in the universe on which to train these systems," computer scientist Stuart Russell said back in 2023.

Meta, the parent company of Facebook and Instagram, has been forced to lift the veil on how this is actually done. A court case revealed that Meta had illegally downloaded the well-known pirate library LibGen to obtain millions of copyrighted texts. After the programmers received approval from Zuckerberg himself, the books were used to train the company's LLM. One of the largest companies in the world thus did not pay for a single copy of these books.

The Atlantic has created a search tool that lets you check which works in the LibGen files Meta used. The dataset is extremely large, covering more than 7.5 million books, about 81 million scientific papers, and other works.

The lawsuit was led by authors Ta-Nehisi Coates and Sarah Silverman, who gained insight into Meta's data piracy through a previous lawsuit in 2023. The new search tool lets writers and scientists see which of their works have been "ripped off" by the corporation to train commercial AI.

"My book is here, and that's a good thing! LibGen makes texts available to people who otherwise wouldn't have access. The problem isn't that LibGen makes content available for free, it's that Meta steals that content for profit," says Wired author Justin Ling.

A final decision in the ongoing case is not expected until the summer. In the meantime, Llama is up and running and available for free on platforms such as Facebook, Instagram, and WhatsApp. This is not the only such lawsuit against a major corporation: a year ago, authors sued NVIDIA.

Source: Futurism


