GitHubProfilesRepositories Contract
The GitHubProfilesRepositories contract demonstrates how to fetch GitHub profile data, read a profile's repository count, and store the handles of high-contributing developers. It shows how to combine the strict-equality equivalence principle (gl.eq_principle_strict_eq) with regular-expression pattern matching to process web content.
# { "Depends": "py-genlayer:test" }

from genlayer import *

import typing
import re


@gl.contract
class GitHubProfilesRepositories:
    github_profiles: DynArray[str]

    def __init__(self):
        pass

    @gl.public.write
    def store_high_contributors_github_profile(self, github_handle: str) -> typing.Any:
        github_profile_url = "https://github.com/" + github_handle

        def fetch_github_profile_repositories() -> int:
            profile_web_page = gl.get_webpage(github_profile_url, mode="text")
            # Regular expression to find the number between "Repositories" and "Projects"
            pattern = r"Repositories\s+(\d+)\s+Projects"
            # Search for the pattern
            match = re.search(pattern, profile_web_page)
            # Extract the number if found
            if match:
                return int(match.group(1))  # Group 1 contains the captured number
            else:
                return 0

        # Every validator must obtain exactly the same count for the result to be accepted
        repositories = gl.eq_principle_strict_eq(fetch_github_profile_repositories)

        if repositories > 25:
            self.github_profiles.append(github_handle)

    @gl.public.view
    def show_github_profiles(self) -> list[str]:
        # Copy the stored handles into a plain list for the caller
        return [profile for profile in self.github_profiles]
Code Explanation
- Initialization: The GitHubProfilesRepositories class initializes with an empty dynamic array to store the GitHub handles of high-contributing developers.
- Write Method: store_high_contributors_github_profile(github_handle) analyzes a GitHub profile's repository count.
  - Uses regular expressions to extract the repository count from the profile page.
  - Stores profiles with more than 25 repositories.
- Read Method: show_github_profiles() returns the list of stored high-contributor profiles.
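The extraction step can be tried outside a contract with plain Python. The sketch below applies the contract's pattern to hand-written sample strings; the sample text and the helper name extract_repository_count are illustrative assumptions, not real GitHub output.

```python
import re

# Same pattern as in the contract: a number between "Repositories" and "Projects".
PATTERN = r"Repositories\s+(\d+)\s+Projects"


def extract_repository_count(page_text: str) -> int:
    """Return the captured repository count, or 0 when the pattern is absent."""
    match = re.search(PATTERN, page_text)
    return int(match.group(1)) if match else 0


# Hypothetical text-mode renderings of a profile page (made up for illustration).
sample_with_count = "Overview\nRepositories 42\nProjects 3\nPackages"
sample_without_count = "Page not found"

print(extract_repository_count(sample_with_count))     # 42
print(extract_repository_count(sample_without_count))  # 0
```

Because \s+ matches any run of whitespace, the pattern tolerates the labels and the number being separated by spaces or line breaks, but it returns 0 as soon as either label is missing.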
Key Components
- GitHub Integration: Uses gl.get_webpage() to fetch profile content.
- Pattern Matching: Employs regular expressions to extract repository counts.
- Deterministic Execution: Uses gl.eq_principle_strict_eq() to ensure network consensus.
- Conditional Storage: Only stores profiles meeting specific criteria.
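As a rough mental model of strict equality (not GenLayer's actual implementation), several validators each run the same function and the result is accepted only if every output is identical. The helper strict_equality_agrees and the sample validator outputs below are hypothetical.

```python
from typing import Sequence


def strict_equality_agrees(validator_outputs: Sequence[int]) -> bool:
    """Return True only when every validator produced exactly the same value."""
    return len(set(validator_outputs)) == 1


# Made-up outputs: each number stands for the repository count one validator extracted.
print(strict_equality_agrees([31, 31, 31]))  # True
print(strict_equality_agrees([31, 32, 31]))  # False
```

Under this model, a repository count that changes between two validators' fetches would cause disagreement, which is why returning a single integer rather than the whole page keeps the compared value as stable as possible.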
Deploying the Contract
To deploy the GitHubProfilesRepositories contract:
- Deploy the Contract: No initial parameters are needed for deployment.
- The contract will initialize with an empty profiles array.
Checking the Contract State
After deployment, you can:
- Use show_github_profiles() to view the list of stored high-contributor profiles.
- Initially, this will return an empty list.
Executing Transactions
To interact with the deployed contract:
- Call store_high_contributors_github_profile(github_handle) with a GitHub username.
- The function will:
  - Fetch the profile page content
  - Extract the repository count
  - Store the handle if repositories > 25
  - Make the profile list available through show_github_profiles()
Understanding Data Processing
This contract demonstrates several important concepts:
- Web Scraping: Shows how to extract specific data from web pages.
- Regular Expressions: Demonstrates pattern matching in web content.
- Conditional Logic: Implements criteria-based storage decisions.
- Dynamic Arrays: Shows how to maintain a growing list of data.
Handling Different Scenarios
- Initial State: The profiles list starts empty.
- High Contributors: Profiles with >25 repositories are added to the list.
- Low Contributors: Profiles with ≤25 repositories are not stored.
- Invalid Profiles: If the repository count cannot be found on the fetched page, the fetch function returns 0 and the profile is not stored.
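The scenarios above can be reproduced off-chain with a plain-Python stand-in for the contract. This is a sketch under stated assumptions: ProfileStoreSketch, the fetch_page callable (which replaces gl.get_webpage), and the FAKE_PAGES data are all hypothetical, and no equivalence principle is involved.

```python
import re
from typing import Callable, List

PATTERN = r"Repositories\s+(\d+)\s+Projects"
THRESHOLD = 25  # the contract stores handles with more than 25 repositories


class ProfileStoreSketch:
    """Plain-Python stand-in for the contract; fetch_page replaces gl.get_webpage."""

    def __init__(self, fetch_page: Callable[[str], str]) -> None:
        self.fetch_page = fetch_page
        self.github_profiles: List[str] = []

    def store_high_contributors_github_profile(self, github_handle: str) -> None:
        page = self.fetch_page("https://github.com/" + github_handle)
        match = re.search(PATTERN, page)
        repositories = int(match.group(1)) if match else 0
        if repositories > THRESHOLD:
            self.github_profiles.append(github_handle)

    def show_github_profiles(self) -> List[str]:
        return list(self.github_profiles)


# Hypothetical page texts keyed by URL; a real page would be fetched from GitHub.
FAKE_PAGES = {
    "https://github.com/alice": "Repositories 40 Projects",
    "https://github.com/bob": "Repositories 25 Projects",
}

store = ProfileStoreSketch(lambda url: FAKE_PAGES.get(url, "Not Found"))
print(store.show_github_profiles())  # [] (initial state)
for handle in ("alice", "bob", "no-such-user"):
    store.store_high_contributors_github_profile(handle)
print(store.show_github_profiles())  # ['alice']
```

Only alice is stored: bob sits exactly on the threshold (25 is not greater than 25), and the unknown handle yields a page without the pattern, so its count falls back to 0.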
Important Notes
- This example focuses on public GitHub profiles only.
- Repository counts may change over time.
- The regular expression pattern assumes specific GitHub page structure.
- The threshold of 25 repositories is arbitrary and can be adjusted.
Security Considerations
- Be aware that GitHub's page structure might change.
- Consider implementing error handling for malformed profiles.
- Respect GitHub's rate limits and terms of service.
- Validate input GitHub handles before processing.
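One way to validate handles before building the URL is a format check. The sketch below assumes GitHub's username rules as commonly documented (alphanumeric characters or single hyphens, no leading or trailing hyphen, at most 39 characters); the helper name is_valid_github_handle is hypothetical, and the rules should be checked against GitHub's current documentation before relying on them.

```python
import re

# Assumed rules: 1-39 characters, alphanumerics or single hyphens,
# with no hyphen at the start or end.
HANDLE_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9]|-(?=[A-Za-z0-9])){0,38}")


def is_valid_github_handle(handle: str) -> bool:
    """Return True when the whole string fits the assumed username format."""
    return HANDLE_PATTERN.fullmatch(handle) is not None


for candidate in ("octocat", "my-handle", "-leading", "double--hyphen", "a/b"):
    print(candidate, is_valid_github_handle(candidate))
```

Rejecting characters such as "/" before concatenating the handle onto https://github.com/ keeps the fetched URL pointing at a profile page rather than at an arbitrary path on the site.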
Performance Optimization
- The contract only stores handles, not full profile content.
- The regular expression targets a single value, so only the repository count is extracted from the page.
- Conditional storage prevents unnecessary state bloat.
- Dynamic array allows for efficient list management.
You can monitor the contract's behavior through transaction logs, which will show the repository counts and profile additions as they occur.