Text watermarking is a technique for embedding hidden information within textual content to verify its authenticity, origin, or ownership. Research on the subject dates to 1997, and interest has grown with the rise of generative AI systems built on large language models (LLMs). Potential applications include detecting fake news and academic cheating, and excluding AI-generated material from LLM training data. For LLM output, work has focused on linguistic approaches, in which the model's word choices form statistical patterns in the text that can later be identified. The first reported large-scale public deployment, a trial in Google's Gemini chatbot, was described in October 2024: across 20 million responses, users found watermarked and unwatermarked text to be of equal quality.
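The linguistic approach described above can be illustrated with a minimal sketch of one common scheme from the research literature: a hash-seeded "green list", in which half the vocabulary is deterministically marked "green" based on the preceding word, the generator prefers green words, and a detector tests whether a suspiciously high fraction of words is green. All function names here are illustrative, and this is not the method of any particular deployed system.

```python
import hashlib
import math

def green_list(prev_word: str, vocab: list[str]) -> set[str]:
    """Deterministically mark roughly half the vocabulary 'green',
    seeded by the previous word (illustrative scheme, not a real system)."""
    greens = set()
    for w in vocab:
        digest = hashlib.sha256((prev_word + "|" + w).encode()).digest()
        if digest[0] % 2 == 0:  # ~50/50 split of the vocabulary
            greens.add(w)
    return greens

def green_fraction(text: str, vocab: list[str]) -> float:
    """Fraction of in-vocabulary words that fall in their green list."""
    words = text.lower().split()
    hits = total = 0
    for prev, cur in zip(words, words[1:]):
        if cur in vocab:
            total += 1
            if cur in green_list(prev, vocab):
                hits += 1
    return hits / total if total else 0.0

def z_score(fraction: float, n: int) -> float:
    """Test against the null hypothesis that unwatermarked text
    lands on the green list with probability 1/2."""
    return (fraction - 0.5) * math.sqrt(n) / 0.5
```

A watermark-aware generator would bias its sampling toward each step's green list; a detector computes `green_fraction` over a suspect text and flags it when the z-score exceeds a threshold, with no access to the model itself.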
References
- Kamaruddin, Nurul Shamimi; Kamsin, Amirrudin; Por, Lip Yee; Rahman, Hameedur (2018). "A Review of Text Watermarking: Theory, Methods, and Applications". IEEE Access. 6: 8011–8028. Bibcode:2018IEEEA...6.8011K. doi:10.1109/ACCESS.2018.2796585. ISSN 2169-3536.
- Liu, Aiwei; Pan, Leyi; Lu, Yijian; Li, Jingjing; Hu, Xuming; Zhang, Xi; Wen, Lijie; King, Irwin; Xiong, Hui; Yu, Philip (2024). "A Survey of Text Watermarking in the Era of Large Language Models". ACM Computing Surveys. 57 (2): 1–36. arXiv:2312.07913. doi:10.1145/3691626. ISSN 0360-0300.
- Gibney, Elizabeth (2024). "Google unveils invisible 'watermark' for AI-generated text". Nature. 634 (8036): 1027–1028. Bibcode:2024Natur.634.1027G. doi:10.1038/d41586-024-03462-7. PMID 39443774.