LLM Echo Chamber: personalized and automated disinformation
Lightning Talk
Recent advancements have significantly highlighted the capabilities of Large Language Models (LLMs) such as GPT-4 and Llama2 in performing diverse tasks including text summarization, translation, and content review. Despite their evident benefits, the implications of their widespread application warrant careful consideration. It has been underscored that the potential for these models to disseminate misinformation exists, which poses challenges in dealing with this more persuasive, faster-generated, human-liked misinformation. This concern is exacerbated by the capacity of LLMs to influence public opinion because of their wide usage. This study aims to critically examine the risks associated with LLMs, particularly their ability to disseminate misinformation on specific topics as factual and persuasively. To this end, we built the “LLM Echo Chamber”, a controlled digital environment designed to mimic the dynamics of social media platforms, specifically chatrooms, where misinformation often proliferates. The phenomenon of echo chambers is well-known - only interacting with those of the same opinions further reinforces a person’s beliefs and causes them to discard other viewpoints. The “LLM Echo Chamber” could help us study the effect of multiple malicious misinformation spreading bots in a chatroom, a common scenario for the internet Echo Chamber phenomenon. We first did a review of existing LLMs and their associated risks, LLM’s ability to spread misinformation, an exploration of state-of-the-art (SOTA) techniques for model finetuning, and some advanced methods for constructing interactive chatrooms. The model selection was based on the model’s performance, considerations of computing resources, and the level of safeguards. With Microsoft’s phi-2 model finetuned on the identity-shifting dataset we created, we could let the model generate harmful content. Subsequently, we developed an “LLM Echo Chamber” leveraging our finetuned model, frontend tools, and context-aware backend tools, employing specific prompt engineering and interactive logic to enhance the chatroom’s credibility. The efficacy of the chatroom was evaluated by automated evaluation based on GPT-4, which could provide us a comprehensive overview of the persuasiveness and harmfulness of our “LLM Echo Chamber”. Our findings contribute to the broader discourse on the ethical implications of LLMs and highlight the necessity for robust mechanisms to mitigate the potential dissemination of misinformation.