coorsim

The coorsim R package is designed to detect and analyze coordinated social media manipulation (CSMM), allowing researchers to identify suspiciously similar patterns of social media behavior that may indicate coordinated efforts to spread content. By leveraging embeddings, coorsim detects similarities in posts that share themes and semantics, even if they use diverse vocabulary or languages. This approach is particularly relevant for identifying manipulation with AI-generated content.

Installation

You can install the development version of coorsim from GitHub with:

# install.packages("devtools")
devtools::install_github("thieled/coorsim")

Example

Below is an example of how to use coorsim to detect and analyze coordinated behavior in a set of Twitter data.

Step 1: Load Data

Prepare two data sets: posts, containing the tweets, and users, containing the user metadata. A matrix of post embeddings must also be available.

posts <- readRDS("/path/to/file")
users <- readRDS("/path/to/user_file")
post_embedding_matrix <- readRDS("/path/to/embedding_file")
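For orientation, here is a minimal sketch of the input shapes the later steps appear to expect, built from toy data. The column names follow the arguments passed in Step 2. The layout of the embedding matrix (one row per post, row names equal to the post IDs) is an assumption and is not confirmed by the package documentation; all toy_* objects are hypothetical.

```r
# Toy inputs (hypothetical values, for illustration only)
toy_posts <- data.frame(
  tweet_id   = c("t1", "t2", "t3"),
  user_id    = c("u1", "u2", "u3"),
  created_at = as.POSIXct("2024-01-01 12:00:00", tz = "UTC") + c(0, 20, 300),
  text       = c("Vote now!", "Go and vote now", "Nice weather today"),
  stringsAsFactors = FALSE
)

toy_users <- data.frame(
  user_id = c("u1", "u2", "u3"),
  name    = c("Account A", "Account B", "Account C"),
  stringsAsFactors = FALSE
)

# Assumed layout: one row per post, row names = post IDs,
# columns = embedding dimensions (random numbers stand in for real embeddings)
set.seed(1)
toy_embeddings <- matrix(
  rnorm(nrow(toy_posts) * 4),
  nrow = nrow(toy_posts),
  dimnames = list(toy_posts$tweet_id, NULL)
)

# Basic consistency checks before running the pipeline
stopifnot(
  nrow(toy_embeddings) == nrow(toy_posts),
  all(rownames(toy_embeddings) %in% toy_posts$tweet_id),
  all(toy_posts$user_id %in% toy_users$user_id)
)
```

Running similar checks on real data before Step 2 can catch mismatched IDs early.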

Step 2: Detect Co-Similar Posts

Run co-similarity detection on posts published within 60 seconds of each other, using a cosine similarity threshold of 0.95.

sim_dt <- coorsim::detect_cosimilarity(
  data = posts,
  vector_matrix = post_embedding_matrix,
  time_window = 60,
  min_simil = 0.95,
  min_participation = 3,
  post_id = "tweet_id",
  account_id = "user_id",
  time = "created_at",
  content = "text",
  verbose = TRUE
)
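To make the two thresholds concrete, the following base-R sketch flags pairs of posts that are close both in time and in embedding space. It is a brute-force illustration, not the package's implementation; the helper flag_cosimilar_pairs and the toy inputs are hypothetical.

```r
# Hypothetical helper: brute-force pairwise comparison, for illustration only
flag_cosimilar_pairs <- function(ids, times, emb,
                                 time_window = 60, min_simil = 0.95) {
  # Scale each row to unit length so the cross product is the cosine similarity
  unit <- emb / sqrt(rowSums(emb^2))
  simil <- unit %*% t(unit)

  # Absolute time difference in seconds between every pair of posts
  secs <- as.numeric(times)
  dt <- abs(outer(secs, secs, "-"))

  # Keep each unordered pair once (upper triangle) if it passes both thresholds
  hit <- which(upper.tri(simil) & simil >= min_simil & dt <= time_window,
               arr.ind = TRUE)
  data.frame(
    post_a     = ids[hit[, "row"]],
    post_b     = ids[hit[, "col"]],
    similarity = simil[hit],
    time_diff  = dt[hit],
    stringsAsFactors = FALSE
  )
}

ids   <- c("t1", "t2", "t3")
times <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC") + c(0, 20, 300)
emb   <- rbind(c(1, 0), c(0.99, 0.05), c(1, 0.01))

pairs <- flag_cosimilar_pairs(ids, times, emb)
print(pairs)
```

In this toy example only t1 and t2 are flagged: t3 is nearly identical to t1 in embedding space but was posted 300 seconds later, outside the time window.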

Step 3: Detect Communities

Identify communities of accounts using the FSA_V method to reveal groups with coordinated posting behavior.

comm_dt <- coorsim::coorsim_detect_groups(
  simdt = sim_dt,
  user_data = users,
  cluster_method = "FSA_V",
  account_id = "user_id",
  theta = 0.7,
  verbose = TRUE
)
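Community detection operates on accounts rather than posts, so pairs of co-similar posts have to be lifted to an account-level network first. The base-R sketch below shows one plausible way to do that by counting how often two accounts posted co-similar content. The column names (account_a, account_b) and the count-based edge weight are assumptions for illustration; the sketch does not reproduce the FSA_V method or the package's internal data structures.

```r
# Hypothetical post-pair table: each row links two accounts via co-similar posts
pair_dt <- data.frame(
  account_a = c("u1", "u2", "u1", "u3", "u2"),
  account_b = c("u2", "u1", "u3", "u4", "u1"),
  stringsAsFactors = FALSE
)

# Order each pair so that (u1, u2) and (u2, u1) count as the same undirected edge
from <- pmin(pair_dt$account_a, pair_dt$account_b)
to   <- pmax(pair_dt$account_a, pair_dt$account_b)

# Count co-similar post pairs per account pair; the count serves as edge weight
edges <- aggregate(
  list(weight = rep(1L, length(from))),
  by = list(from = from, to = to),
  FUN = sum
)

# Strongest ties first
edges <- edges[order(-edges$weight), ]
print(edges)
```

An edge list of this kind is the usual input to graph-based community detection, where heavily weighted ties are more likely to end up in the same group.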

Step 4: Prepare for Community Labeling

Sample post content and account metadata for each community; these samples serve as input for the labeling step.

comm_dt <- coorsim::prepare_community_texts(
  groups_data = comm_dt,
  sample_n = 5,
  min_n_char = 10,
  verbose = TRUE
)
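The idea behind sample_n and min_n_char can be sketched in base R: drop very short posts, then draw up to a fixed number of posts per community. Reading the two parameters this way is an assumption based on their names, and the data frame with its columns (community, text) is hypothetical.

```r
# Hypothetical posts already assigned to communities
community_posts <- data.frame(
  community = c(1, 1, 1, 1, 2, 2),
  text = c("ok", "Vote for candidate X today",
           "Candidate X is the best choice", "Everyone should vote X",
           "Buy this coin now!!!", "yes"),
  stringsAsFactors = FALSE
)

sample_n   <- 2   # maximum number of posts drawn per community
min_n_char <- 10  # posts shorter than this are ignored

# Keep only posts that are long enough to be informative
long_enough <- community_posts[nchar(community_posts$text) >= min_n_char, ]

# Draw up to sample_n posts within each community
set.seed(42)
sampled <- lapply(split(long_enough, long_enough$community), function(d) {
  d[sample.int(nrow(d), size = min(sample_n, nrow(d))), , drop = FALSE]
})
sampled <- do.call(rbind, sampled)

# Collapse the sampled posts into one text block per community
community_texts <- vapply(
  split(sampled$text, sampled$community),
  paste, character(1), collapse = "\n"
)
print(community_texts)
```

Capping the sample keeps the text passed to the language model short, while the length filter avoids spending that budget on uninformative posts.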

Step 5: Label Communities

Use a language model to generate a label and a short description for each identified community.

instruction <- "Generate a concise label in English and a one-sentence description that summarizes the themes, tone, and regional focus of this community of Twitter users. The account names, locations, short bios, and sampled posts are provided below. Use '[LABEL:]' for the label and '[DESCRIPTION:]' for the description. Provide no additional output."

label_res <- coorsim::label_communities(
  groups_data = comm_dt,
  instruction = instruction,
  llm = "llama3.1:8b",
  retries = 3
)
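Because the instruction asks the model to mark its answer with '[LABEL:]' and '[DESCRIPTION:]', a response can be split with a regular expression. The sketch below shows one way to do this in base R on a made-up response string. The helper parse_label_response is hypothetical, and the package may parse responses differently.

```r
# Hypothetical helper: extract label and description from a tagged response
parse_label_response <- function(response) {
  # (?s) lets "." match newlines, so multi-line answers are handled
  pattern <- "(?s)\\[LABEL:\\]\\s*(.*?)\\s*\\[DESCRIPTION:\\]\\s*(.*)"
  m <- regmatches(response, regexec(pattern, response, perl = TRUE))[[1]]

  # No match: return NA so the caller can decide to ask the model again
  if (length(m) != 3) {
    return(list(label = NA_character_, description = NA_character_))
  }
  list(label = trimws(m[2]), description = trimws(m[3]))
}

# Made-up response text, for illustration only
response <- paste0(
  "[LABEL:] Crypto promoters\n",
  "[DESCRIPTION:] Accounts pushing a new token with urgent, promotional language."
)

parsed <- parse_label_response(response)
print(parsed$label)
print(parsed$description)
```

Returning NA for malformed output makes it easy to detect responses that ignored the requested format, which is one situation where retrying the request can help.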

Step 6: Visualize Community Network

Plot the labeled community network and the coordinated posts by community.

p1 <- coorsim::plot_communities(network_data = label_res, component_size_threshold = 3)
p2 <- coorsim::plot_coordinated_posts(network_data = label_res, by_community = TRUE)
