Detecting and Mitigating Harmful Content
Research on identifying and preventing harmful or malicious content generated by or input to LLMs
This presentation covers 102 research papers on applying large language models to detecting and mitigating harmful content.