LyalinDotCom/ExtensionsScanner

GitHub Repository Scanner

A flexible Node.js tool for scanning GitHub repositories from organizations, topics, and specific repos, aggregating their data into JSON format.

Features

  • 📦 Scan entire GitHub organizations
  • 🏷️ Search repositories by topic
  • 🎯 Fetch specific repositories by name
  • 🔄 Automatic deduplication
  • 📅 Timestamped output files
  • ⚙️ Fully configurable via JSON

Installation

  1. Clone this repository:
git clone <your-repo-url>
cd ExtensionsScanner
  2. Install dependencies:
npm install
  3. Set up your GitHub token:
cp .env.example .env
  4. Edit .env and add your GitHub Personal Access Token:
GITHUB_TOKEN=your_github_token_here

Creating a GitHub Token

  1. Go to GitHub Settings > Developer settings > Personal access tokens > Tokens (classic)
  2. Click "Generate new token (classic)"
  3. Select scopes:
    • public_repo (enough if you only scan public repositories)
    • repo (if you need private repositories; this scope already includes public_repo)
    • read:user (optional, for user data)
  4. Copy the token and paste it in your .env file
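
Once the token is in .env, the scanner has to attach it to every API request. The following is a minimal sketch of that step, not the project's actual code: buildAuthHeaders is a hypothetical helper, and it takes the environment as a parameter because this README does not say how .env is loaded into the process.

```javascript
// Hypothetical helper: build GitHub REST API request headers from an environment object.
// Assumption: the token is exposed as GITHUB_TOKEN, as in the .env template above.
function buildAuthHeaders(env = process.env) {
  const token = (env.GITHUB_TOKEN || '').trim();
  // Reject both a missing token and the untouched placeholder from .env.example.
  if (!token || token === 'your_github_token_here') {
    throw new Error('GITHUB_TOKEN is missing; add a real token to .env');
  }
  return {
    Authorization: `Bearer ${token}`,
    Accept: 'application/vnd.github+json',
    'User-Agent': 'extensions-scanner-example', // illustrative value
  };
}
```

Failing early on the placeholder value gives a clearer error than the 401 response the API would otherwise return.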

Configuration

Edit config/config.json to customize what gets scanned:

{
  "organizations": [
    "gemini-cli-extensions"
  ],
  "topics": [
    "gemini-cli-extensions"
  ],
  "repositories": [
    "owner/repo-name"
  ],
  "output": {
    "filename": "gemini-extensions.json"
  }
}

Configuration Options

  • organizations: Array of GitHub organization names to scan
  • topics: Array of topics to search for
  • repositories: Array of specific repositories in owner/repo format
  • output.filename: Name of the output file (base name without timestamp)
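
The options above could be checked before a scan starts. This is a sketch under assumptions: normalizeConfig is a hypothetical function, and treating a missing array as empty is a choice made here, not documented behavior of the scanner.

```javascript
// Hypothetical validator for the config shape documented above.
function normalizeConfig(raw) {
  // Assumption: a missing list is treated as empty rather than as an error.
  const asStringArray = (value, key) => {
    if (value === undefined) return [];
    if (!Array.isArray(value) || value.some((v) => typeof v !== 'string')) {
      throw new TypeError(`"${key}" must be an array of strings`);
    }
    return value;
  };

  const config = {
    organizations: asStringArray(raw.organizations, 'organizations'),
    topics: asStringArray(raw.topics, 'topics'),
    repositories: asStringArray(raw.repositories, 'repositories'),
    output: { filename: raw.output && raw.output.filename },
  };

  // Each repository entry must look like owner/repo.
  const badRepo = config.repositories.find((r) => !/^[^\/\s]+\/[^\/\s]+$/.test(r));
  if (badRepo) throw new Error(`Repository "${badRepo}" is not in owner/repo format`);
  if (typeof config.output.filename !== 'string' || config.output.filename === '') {
    throw new Error('"output.filename" must be a non-empty string');
  }
  return config;
}
```

It could be called as normalizeConfig(JSON.parse(fs.readFileSync('config/config.json', 'utf8'))) so that a malformed config fails before any API request is made.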

Usage

Run the scanner:

npm start

Or use the scan command:

npm run scan

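Or execute the entry point directly (this only works if src/index.js is marked executable and begins with a Node.js shebang line):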

./src/index.js

Output

The scanner generates two files in the output/ directory:

  1. Timestamped file: <filename>_YYYY-MM-DDTHH-MM-SS.json - Historical record
  2. Latest file: <filename>.json - Always contains the most recent scan
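
The two file names can be derived from output.filename roughly as follows. This is an illustrative sketch: outputPaths is a hypothetical helper, and the use of UTC and the handling of the .json extension are assumptions, since the README does not specify either.

```javascript
const path = require('path');

// Hypothetical helper: derive both output paths from the configured filename.
// Assumptions: timestamps are in UTC, and ".json" is added if the name has no extension.
function outputPaths(filename, date = new Date(), dir = 'output') {
  const { name, ext } = path.parse(filename);
  // 2024-01-15T10:30:00.000Z becomes 2024-01-15T10-30-00 (colons are not portable in file names)
  const stamp = date.toISOString().slice(0, 19).replace(/:/g, '-');
  const suffix = ext || '.json';
  return {
    timestamped: path.join(dir, `${name}_${stamp}${suffix}`),
    latest: path.join(dir, `${name}${suffix}`),
  };
}
```

Writing the same JSON string to both paths keeps a historical record while leaving a stable file name for downstream tools to read.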

Output Format

Each repository in the JSON includes:

{
  "url": "https://github.com/owner/repo",
  "description": "Repository description",
  "stars": 42,
  "lastUpdated": "2024-01-15T10:30:00Z"
}
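
A record like this can be produced from a GitHub REST API repository object. The sketch below is an assumption about the mapping, not the project's formatter: formatRepository is a hypothetical name, and the README does not say which API timestamp feeds lastUpdated (updated_at is used here; pushed_at would be another candidate).

```javascript
// Hypothetical mapper from a GitHub REST API repository object to the output shape.
function formatRepository(apiRepo) {
  return {
    url: apiRepo.html_url,
    // Assumption: repositories without a description get an empty string.
    description: apiRepo.description ?? '',
    stars: apiRepo.stargazers_count,
    // Assumption: lastUpdated comes from updated_at.
    lastUpdated: apiRepo.updated_at,
  };
}
```

Dropping every other API field keeps the output small and stable even if the API response grows.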

Examples

Scan a single organization

{
  "organizations": ["microsoft"],
  "topics": [],
  "repositories": [],
  "output": {
    "filename": "microsoft-repos.json"
  }
}
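
Large organizations have more repositories than one API response can hold, so an organization scan has to page through results. As a rough sketch, the request URLs for the REST endpoint GET /orgs/{org}/repos could be built like this; orgReposUrl is a hypothetical helper, and whether the scanner uses this endpoint or a client library is not stated in this README.

```javascript
// Hypothetical helper: URL for one page of an organization's repositories.
// per_page is capped at 100 by the GitHub REST API.
function orgReposUrl(org, page = 1, perPage = 100) {
  const url = new URL(`https://api.github.com/orgs/${encodeURIComponent(org)}/repos`);
  url.searchParams.set('per_page', String(perPage));
  url.searchParams.set('page', String(page));
  return url.toString();
}
```

A caller would request page 1, 2, 3 and so on until a page comes back with fewer than perPage entries.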

Scan multiple topics

{
  "organizations": [],
  "topics": ["nodejs", "typescript", "ai-tools"],
  "repositories": [],
  "output": {
    "filename": "awesome-repos.json"
  }
}
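
Topic scans map naturally onto the repository search endpoint with a topic: qualifier, one query per configured topic. The following is a sketch under that assumption; topicSearchUrls is a hypothetical helper and the scanner may build its queries differently.

```javascript
// Hypothetical helper: one repository-search URL per configured topic.
function topicSearchUrls(topics, perPage = 100) {
  return topics.map((topic) => {
    const url = new URL('https://api.github.com/search/repositories');
    url.searchParams.set('q', `topic:${topic}`); // searchParams handles the URL encoding
    url.searchParams.set('per_page', String(perPage));
    return url.toString();
  });
}
```

Running one query per topic means a repository tagged with several of the listed topics is returned more than once, which is one reason the results need deduplication.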

Mix and match

{
  "organizations": ["vercel"],
  "topics": ["nextjs"],
  "repositories": ["facebook/react", "vuejs/vue"],
  "output": {
    "filename": "frontend-frameworks.json"
  }
}
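
A mixed config like this one can return the same repository from more than one source, for example from both an organization scan and a topic search. A minimal sketch of the deduplication step, assuming records in the output format and the repository URL as the identity key (dedupeByUrl is a hypothetical name, not necessarily what deduplicator.js does):

```javascript
// Hypothetical deduplicator: keeps the first record seen for each repository URL.
function dedupeByUrl(repos) {
  const seen = new Map();
  for (const repo of repos) {
    // GitHub owner and repository names are case-insensitive, so compare in lower case.
    const key = repo.url.toLowerCase();
    if (!seen.has(key)) seen.set(key, repo);
  }
  return [...seen.values()];
}
```

Using a Map preserves the order in which repositories were first found, so the output stays stable between runs with the same inputs.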

Project Structure

ExtensionsScanner/
├── src/
│   ├── api/
│   │   └── github.js              # GitHub API client
│   ├── scanners/
│   │   ├── organizationScanner.js # Organization repo scanner
│   │   ├── topicScanner.js        # Topic search scanner
│   │   └── repositoryScanner.js   # Specific repo scanner
│   ├── utils/
│   │   ├── deduplicator.js        # Deduplication logic
│   │   ├── formatter.js           # Data formatting
│   │   └── fileWriter.js          # JSON file writer
│   └── index.js                   # Main entry point
├── output/                        # Generated JSON files
├── config/
│   └── config.json                # Scanner configuration
├── .env                           # Environment variables (not committed)
├── .env.example                   # Environment template
└── package.json
