A flexible Node.js tool for scanning GitHub repositories from organizations, topics, and specific repos, aggregating their data into JSON format.
- Scan entire GitHub organizations
- Search repositories by topic
- Fetch specific repositories by name
- Automatic deduplication
- Timestamped output files
- Fully configurable via JSON
- Clone this repository:
    git clone <your-repo-url>
    cd ExtensionsScanner
- Install dependencies:
    npm install
- Set up your GitHub token:
    cp .env.example .env
- Edit .env and add your GitHub Personal Access Token:
    GITHUB_TOKEN=your_github_token_here
To create a token:
- Go to GitHub Settings > Tokens
- Click "Generate new token (classic)"
- Select scopes:
  - public_repo (for public repositories)
  - repo (if you need private repositories)
  - read:user (optional, for user data)
- Copy the token and paste it into your .env file
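As a rough sketch of how the token might be consumed once it is in .env, the snippet below builds request headers from GITHUB_TOKEN. The helper name buildGitHubHeaders is hypothetical and not taken from this project, and the sketch assumes the variable has already been loaded into the environment (for example by a dotenv-style loader).

```javascript
// Hypothetical helper (not part of this project): builds headers for
// GitHub REST API requests from an environment-like object.
// Assumes GITHUB_TOKEN has already been loaded into the environment.
function buildGitHubHeaders(env = process.env) {
  const token = (env.GITHUB_TOKEN || '').trim();
  // Reject a missing token and the placeholder value from .env.example.
  if (!token || token === 'your_github_token_here') {
    throw new Error('GITHUB_TOKEN is missing; copy .env.example to .env and set it.');
  }
  return {
    Accept: 'application/vnd.github+json',
    Authorization: `Bearer ${token}`,
    'User-Agent': 'extensions-scanner-example', // illustrative value
  };
}

// Usage: pass an explicit object so the example does not need a real token.
const headers = buildGitHubHeaders({ GITHUB_TOKEN: 'example-token' });
console.log(headers.Authorization); // Bearer example-token
```

Failing early on the placeholder value gives a clearer error than an authentication failure returned by the API later.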
Edit config/config.json to customize what gets scanned:
{
  "organizations": [
    "gemini-cli-extensions"
  ],
  "topics": [
    "gemini-cli-extensions"
  ],
  "repositories": [
    "owner/repo-name"
  ],
  "output": {
    "filename": "gemini-extensions.json"
  }
}
- organizations: Array of GitHub organization names to scan
- topics: Array of topics to search for
- repositories: Array of specific repositories in owner/repo format
- output.filename: Name of the output file (base name without timestamp)
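To make the three source types concrete, here is a minimal sketch of how each config key could map to a GitHub REST API request. The function name planRequests and the per_page value are assumptions for illustration; the scanner's real request logic may differ, and pagination is not handled here.

```javascript
// Hypothetical sketch: turn a config object into GitHub REST API URLs.
// planRequests is an illustrative name, not a function from this project.
const API_ROOT = 'https://api.github.com';

function planRequests(config) {
  const requests = [];
  for (const org of config.organizations ?? []) {
    // List repositories that belong to an organization.
    requests.push(`${API_ROOT}/orgs/${encodeURIComponent(org)}/repos?per_page=100`);
  }
  for (const topic of config.topics ?? []) {
    // Repository search using the topic: qualifier.
    const q = encodeURIComponent(`topic:${topic}`);
    requests.push(`${API_ROOT}/search/repositories?q=${q}&per_page=100`);
  }
  for (const fullName of config.repositories ?? []) {
    // Specific repositories must be given as owner/repo.
    const parts = fullName.split('/');
    if (parts.length !== 2 || !parts[0] || !parts[1]) {
      throw new Error(`Expected owner/repo format, got "${fullName}"`);
    }
    requests.push(`${API_ROOT}/repos/${parts[0]}/${parts[1]}`);
  }
  return requests;
}

// Usage with the configuration shown above.
const urls = planRequests({
  organizations: ['gemini-cli-extensions'],
  topics: ['gemini-cli-extensions'],
  repositories: ['owner/repo-name'],
  output: { filename: 'gemini-extensions.json' },
});
urls.forEach((url) => console.log(url));
```

Because the same repository can be reached through more than one of these requests, the results need deduplication before they are written out.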
Run the scanner:
    npm start
Or use the scan command:
    npm run scan
Or execute directly:
    ./src/index.js
The scanner generates two files in the output/ directory:
- Timestamped file: <filename>_YYYY-MM-DDTHH-MM-SS.json (historical record)
- Latest file: <filename>.json (always contains the most recent scan)
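The naming scheme above can be sketched as follows. This is an illustration only: the helper name outputNames is hypothetical, and it assumes the timestamp is taken in UTC and inserted before the .json extension of output.filename.

```javascript
const path = require('node:path');

// Hypothetical helper: derive both output file names from output.filename.
// Assumption: the timestamp is UTC and goes before the file extension.
function outputNames(filename, now = new Date()) {
  const { name, ext } = path.parse(filename);
  const suffix = ext || '.json';
  // "2024-01-15T10:30:00.000Z" -> "2024-01-15T10-30-00"
  const stamp = now.toISOString().slice(0, 19).replace(/:/g, '-');
  return {
    timestamped: `${name}_${stamp}${suffix}`,
    latest: `${name}${suffix}`,
  };
}

// Usage with a fixed date so the result is reproducible.
const names = outputNames('gemini-extensions.json', new Date('2024-01-15T10:30:00Z'));
console.log(names.timestamped); // gemini-extensions_2024-01-15T10-30-00.json
console.log(names.latest);      // gemini-extensions.json
```

Colons are replaced with hyphens because they are not valid in file names on some platforms.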
Each repository in the JSON includes:
{
  "url": "https://github.com/owner/repo",
  "description": "Repository description",
  "stars": 42,
  "lastUpdated": "2024-01-15T10:30:00Z"
}
Example configuration that scans a single organization:
{
  "organizations": ["microsoft"],
  "topics": [],
  "repositories": [],
  "output": {
    "filename": "microsoft-repos.json"
  }
}
Example configuration that searches several topics:
{
  "organizations": [],
  "topics": ["nodejs", "typescript", "ai-tools"],
  "repositories": [],
  "output": {
    "filename": "awesome-repos.json"
  }
}
Example configuration that combines an organization, a topic, and specific repositories:
{
  "organizations": ["vercel"],
  "topics": ["nextjs"],
  "repositories": ["facebook/react", "vuejs/vue"],
  "output": {
    "filename": "frontend-frameworks.json"
  }
}
ExtensionsScanner/
├── src/
│   ├── api/
│   │   └── github.js                  # GitHub API client
│   ├── scanners/
│   │   ├── organizationScanner.js     # Organization repo scanner
│   │   ├── topicScanner.js            # Topic search scanner
│   │   └── repositoryScanner.js       # Specific repo scanner
│   ├── utils/
│   │   ├── deduplicator.js            # Deduplication logic
│   │   ├── formatter.js               # Data formatting
│   │   └── fileWriter.js              # JSON file writer
│   └── index.js                       # Main entry point
├── output/                            # Generated JSON files
├── config/
│   └── config.json                    # Scanner configuration
├── .env                               # Environment variables (not committed)
├── .env.example                       # Environment template
└── package.json
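
For orientation, the formatting and deduplication steps listed under utils/ could look roughly like the sketch below. It is not the actual content of formatter.js or deduplicator.js: the function names are hypothetical, the input fields are those of a GitHub REST API repository object, and using updated_at for lastUpdated and the URL as the deduplication key are assumptions.

```javascript
// Hypothetical formatter: maps a GitHub REST API repository object to the
// record shape used in the output JSON. Using updated_at for lastUpdated
// is an assumption; pushed_at would be another reasonable choice.
function formatRepository(repo) {
  return {
    url: repo.html_url,
    description: repo.description ?? '',
    stars: repo.stargazers_count,
    lastUpdated: repo.updated_at,
  };
}

// Hypothetical deduplicator: keeps the first record seen for each URL,
// comparing URLs case-insensitively.
function deduplicate(records) {
  const seen = new Map();
  for (const record of records) {
    const key = record.url.toLowerCase();
    if (!seen.has(key)) seen.set(key, record);
  }
  return [...seen.values()];
}

// Usage: the same repository found through an organization scan and a
// topic search collapses into a single record.
const apiRepo = {
  html_url: 'https://github.com/owner/repo',
  description: 'Repository description',
  stargazers_count: 42,
  updated_at: '2024-01-15T10:30:00Z',
};
const records = deduplicate([apiRepo, { ...apiRepo }].map(formatRepository));
console.log(records.length); // 1
console.log(JSON.stringify(records[0], null, 2));
```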