octodemo/innersource-spider-mc

InnerSource Crawler

This project creates a repos.json file that can be used by the SAP InnerSource Portal. The current approach assumes that the repositories you want to show in the portal belong to a GitHub organization and that all of them are tagged with a certain topic.
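To illustrate the idea of finding repositories by organization and topic, here is a minimal sketch that uses the GitHub REST API repository search with the `org:` and `topic:` qualifiers. It is not taken from crawler.py, which may work differently; the function names `build_search_url` and `fetch_repos` are hypothetical, and `API_ROOT` assumes the public github.com API.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public github.com. A GitHub Enterprise instance uses a different API root.
API_ROOT = "https://api.github.com"


def build_search_url(organization: str, topic: str, page: int = 1) -> str:
    """Build a repository search URL limited to one organization and one topic."""
    query = f"org:{organization} topic:{topic}"
    params = urllib.parse.urlencode({"q": query, "per_page": 100, "page": page})
    return f"{API_ROOT}/search/repositories?{params}"


def fetch_repos(organization: str, topic: str, token: str) -> list:
    """Collect all matching repositories page by page (performs network calls)."""
    repos, page = [], 1
    while True:
        request = urllib.request.Request(
            build_search_url(organization, topic, page),
            headers={
                "Authorization": f"Bearer {token}",
                "Accept": "application/vnd.github+json",
            },
        )
        with urllib.request.urlopen(request) as response:
            items = json.load(response)["items"]
        repos.extend(items)
        # A page with fewer than 100 items is the last one.
        if len(items) < 100:
            return repos
        page += 1
```

Calling `fetch_repos("my-org", "inner-source", token)` with a valid token would return the matching repository objects, which could then be written out with `json.dump`.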

Support

If you need support using this project or have questions about it, please open an issue in this repository. Requests made directly to GitHub staff or the support team will be redirected here to open an issue. GitHub SLAs and support/services contracts do not apply to this repository.

Use as a GitHub Action

  1. Create a repository to host this GitHub Action or select an existing repository.
  2. Create the env values used in the sample workflow below (GH_TOKEN, ORGANIZATION) as repository secrets, filled in with your own information. More info on creating secrets can be found in the GitHub documentation. Note: Your GitHub token will need to have read/write access to all the repositories in the organization.
  3. Copy the example workflow below into the .github/workflows/ directory of your repository, using the .yml file extension (e.g. .github/workflows/crawler.yml).
  4. Don't forget to do something with the resulting repos.json file. You can move it to another repository if needed or save it as a build artifact. This will all depend on what you are doing with it and what repository you are running this action out of.
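Before moving or publishing repos.json, it can be worth checking that the file looks as expected. The sketch below assumes that repos.json holds a JSON array of GitHub repository objects, each with a `full_name` field; that layout is an assumption, not something this README specifies, and `summarize_repos` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_repos(path: str = "repos.json") -> list:
    """Return the full names of the repositories recorded in the file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file is a JSON array with one object per repository.
    if not isinstance(data, list):
        raise ValueError("expected repos.json to contain a JSON array")
    names = []
    for entry in data:
        # "full_name" is the owner/name field on GitHub repository objects;
        # entries without it are reported with a placeholder.
        if isinstance(entry, dict):
            names.append(entry.get("full_name", "<unknown>"))
        else:
            names.append("<unknown>")
    return names
```

An empty list from `summarize_repos()` usually means that no repository in the organization carries the configured topic.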

Example workflow

name: InnerSource repo crawler

on:
  workflow_dispatch:
  schedule:
    - cron: '00 5 * * *'

jobs:
  build:
    name: InnerSource repo crawler
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v2
    
    - name: Run crawler tool
      uses: docker://ghcr.io/zkoppert/innersource-crawler:v1
      env:
        GH_TOKEN: ${{ secrets.GH_TOKEN }}
        ORGANIZATION: ${{ secrets.ORGANIZATION }}
        TOPIC: inner-source

Local usage without Docker

  1. Copy .env-example to .env
  2. Fill out the .env file with a token from a user that has access to the organization to scan (set in step 4). Tokens should have admin:org or read:org access.
  3. Fill out the .env file with the exact topic name you are searching for
  4. Fill out the .env file with the exact organization that you want to search in
  5. (Optional) Fill out the .env file with the exact URL of the GitHub Enterprise instance that you want to search in. Leave it empty if you want to search in the public github.com.
  6. Run pip install -r requirements.txt to install the dependencies
  7. Run python3 ./crawler.py, which will create a repos.json file containing the relevant metadata for the GitHub repos for the given topic
  8. Copy repos.json to your instance of the SAP-InnerSource-Portal and launch the portal as outlined in their installation instructions
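As a quick check that the .env file from the steps above is filled out before running the crawler, a small sketch like the following can help. It assumes the .env file uses the same variable names as the example workflow (GH_TOKEN, ORGANIZATION, TOPIC); check .env-example for the actual names. `parse_env` and `missing_keys` are hypothetical helpers, not part of this project.

```python
# Assumption: same variable names as in the example workflow.
REQUIRED_KEYS = ("GH_TOKEN", "ORGANIZATION", "TOPIC")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or left empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when all three values are set.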

License

MIT