This paper explores the legal and ethical labyrinth of language model training, revealing risks and solutions in dataset transparency and use
12/26/2023