[BELIEVABILITY] Excessively Agreeable Behavior in Simulated Malicious Agents #4

Open
austinmw opened this issue Aug 9, 2024 · 0 comments
Labels
enhancement New feature or request

Describe the Concern
Agents intended to simulate malicious actors in the Concordia system behave too agreeably and cooperatively. This mismatch between the assigned role and the observed behavior reduces the believability of these agents and undermines the realism of scenarios involving bad actors.

Example Text

TODO

Expected Behavior
Agents designated as malicious should display behaviors consistent with their intended role. This may include:

  1. More confrontational or aggressive communication styles
  2. Attempts to spread misinformation or manipulate other agents
  3. Less cooperation with community norms and guidelines
  4. Occasional violation of platform rules
  5. Resistance to correction or moderation

The behavior should be nuanced and varied enough to avoid becoming predictable or cartoonish, while still clearly representing the actions of a bad actor in the system.
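
As a rough illustration of the list above, the traits could be expressed as data and sampled per turn so that behavior stays varied rather than cartoonish. This is only a sketch: `MaliciousPersona`, `MALICIOUS_TRAITS`, and the `intensity` knob are hypothetical names invented for this example and are not part of the Concordia API.

```python
from __future__ import annotations

import random
from dataclasses import dataclass

# Hypothetical trait keys mirroring the five behaviors listed above.
# None of these identifiers come from Concordia itself.
MALICIOUS_TRAITS = {
    "confrontational": "Use a confrontational or aggressive tone when challenged.",
    "misinformation": "Try to spread misinformation or manipulate other agents.",
    "norm_defiance": "Cooperate less with community norms and guidelines.",
    "rule_breaking": "Occasionally violate platform rules.",
    "resists_moderation": "Resist correction or moderation.",
}


@dataclass
class MaliciousPersona:
    name: str
    intensity: float = 0.6  # assumed knob: fraction of traits active per turn

    def sample_traits(self, rng: random.Random) -> list[str]:
        """Pick a varied, non-empty subset of trait keys for one turn."""
        keys = list(MALICIOUS_TRAITS)
        count = round(self.intensity * len(keys))
        count = min(len(keys), max(1, count))  # always between 1 and all traits
        return rng.sample(keys, count)

    def build_instructions(self, rng: random.Random) -> str:
        """Render the sampled traits as a plain-text instruction block."""
        lines = [f"{self.name} is a bad actor in this simulation."]
        lines += [f"- {MALICIOUS_TRAITS[k]}" for k in self.sample_traits(rng)]
        lines.append("Stay in character; do not become agreeable or apologetic.")
        return "\n".join(lines)
```

Calling `MaliciousPersona("Mallory").build_instructions(random.Random(0))` yields a short instruction block that could be prepended to whatever prompt the agent already receives. Resampling each turn is one possible way to keep the behavior from becoming predictable.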

Context or Scenario
This issue becomes apparent when observing interactions between malicious agents and other entities in the Concordia system, particularly in scenarios designed to test community resilience, moderation effectiveness, or the spread of misinformation.
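
When reviewing transcripts from such scenarios, one crude way to surface the problem is to measure how often a supposedly malicious agent uses cooperative phrasing. The sketch below is a keyword heuristic only, not a validated metric; the marker list, function names, and threshold are all assumptions made for illustration.

```python
import re

# Assumed marker phrases; hypothetical and meant to be tuned per scenario.
AGREEABLE_MARKERS = (
    "you're right",
    "i agree",
    "i apologize",
    "sorry",
    "thank you",
    "good point",
    "happy to help",
    "i understand",
)


def agreeableness_score(utterance: str) -> float:
    """Return the fraction of sentences containing an agreeable marker."""
    text = utterance.lower().replace("\u2019", "'")  # normalize curly apostrophes
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(any(m in s for m in AGREEABLE_MARKERS) for s in sentences)
    return hits / len(sentences)


def flag_agreeable_turns(turns, threshold=0.5):
    """Return indices of turns whose score meets the (assumed) threshold."""
    return [i for i, t in enumerate(turns) if agreeableness_score(t) >= threshold]
```

Running `flag_agreeable_turns` over the utterances of an agent designated as malicious gives a quick list of turns worth inspecting by hand, which could also help populate the Example Text section.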

Suggested Improvement

Use fine-tuned models.

Additional Comments

TODO

austinmw added the enhancement label Aug 9, 2024