Documentation ¶
Overview ¶
Package github fetches pull request data from GitHub using prx or turnserver.
Index ¶
- func CalculateActualTimeWindow(prs []PRSummary, requestedDays int) (actualDays int, hitLimit bool)
- func CountBotPRs(prs []PRSummary) int
- func CountOpenPRsInOrg(ctx context.Context, org, token string) (int, error)
- func CountOpenPRsInRepo(ctx context.Context, owner, repo, token string) (int, error)
- func CountUniqueAuthors(prs []PRSummary) int
- func FetchPRData(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)
- func FetchPRDataViaTurnserver(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)
- func IsBot(author string) bool
- func PRDataFromPRX(prData *prx.PullRequestData) cost.PRData
- type PRDataWithAnalysis
- type PRSummary
- type ProgressCallback
- type SimpleFetcher
Functions ¶
func CalculateActualTimeWindow ¶ added in v0.7.0
CalculateActualTimeWindow checks time coverage for the fetched PRs. Because the multi-query approach already fetches PRs across the full requested period, this function only logs coverage statistics and always returns the requested period unchanged.
Parameters:
- prs: List of PRs fetched (may be from multiple queries)
- requestedDays: Number of days originally requested
Returns:
- actualDays: Always returns requestedDays (multi-query ensures coverage)
- hitLimit: Always returns false (no period adjustment needed)
func CountBotPRs ¶ added in v0.8.0
CountBotPRs counts how many PRs in the list are authored by bots.
func CountOpenPRsInOrg ¶ added in v0.7.0
CountOpenPRsInOrg counts all open PRs across an entire GitHub organization with a single GraphQL query. This is much more efficient than counting PRs repo-by-repo for organizations with many repositories. Only counts PRs created more than 24 hours ago to exclude brand-new PRs.
func CountOpenPRsInRepo ¶ added in v0.7.0
CountOpenPRsInRepo queries the GitHub GraphQL API for the total count of open PRs in a repository that were created more than 24 hours ago. PRs that have been open for less than 24 hours do not yet count as tracking overhead.
Parameters:
- ctx: Context for the API call
- owner: GitHub repository owner
- repo: GitHub repository name
- token: GitHub authentication token
Returns:
- count: Number of open PRs created >24 hours ago
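One way to express the 24-hour cutoff is as a GitHub search qualifier. The sketch below is illustrative only: openPRSearchQuery and exampleNow are hypothetical names, and the query string the package actually sends may differ.

```go
package main

import (
	"fmt"
	"time"
)

// exampleNow is a fixed reference time so the example is reproducible.
var exampleNow = time.Date(2025, 1, 2, 12, 0, 0, 0, time.UTC)

// openPRSearchQuery is a hypothetical helper (not part of the package).
// It builds a GitHub search string matching open PRs in owner/repo that
// were created before now minus 24 hours.
func openPRSearchQuery(owner, repo string, now time.Time) string {
	cutoff := now.Add(-24 * time.Hour).UTC().Format(time.RFC3339)
	return fmt.Sprintf("repo:%s/%s is:pr is:open created:<%s", owner, repo, cutoff)
}

func main() {
	fmt.Println(openPRSearchQuery("owner", "repo", exampleNow))
	// prints: repo:owner/repo is:pr is:open created:<2025-01-01T12:00:00Z
}
```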
func CountUniqueAuthors ¶ added in v0.7.0
CountUniqueAuthors counts the number of unique authors in a slice of PRSummary. Bot authors are excluded from the count.
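The counting behavior described for CountBotPRs and CountUniqueAuthors can be sketched as follows. This is a minimal sketch under stated assumptions: prSummary is a stand-in with a single hypothetical Author field, and isBot uses only the "[bot]" login suffix that GitHub App accounts carry, which may be narrower than the rules IsBot really applies.

```go
package main

import (
	"fmt"
	"strings"
)

// prSummary is a hypothetical stand-in for PRSummary; only Author is modeled.
type prSummary struct {
	Author string
}

// isBot is an assumed heuristic: GitHub App logins end in "[bot]".
func isBot(author string) bool {
	return strings.HasSuffix(author, "[bot]")
}

// countBotPRs counts PRs whose author looks like a bot.
func countBotPRs(prs []prSummary) int {
	n := 0
	for _, pr := range prs {
		if isBot(pr.Author) {
			n++
		}
	}
	return n
}

// countUniqueAuthors counts distinct authors, skipping bots.
func countUniqueAuthors(prs []prSummary) int {
	seen := make(map[string]struct{})
	for _, pr := range prs {
		if isBot(pr.Author) {
			continue
		}
		seen[pr.Author] = struct{}{}
	}
	return len(seen)
}

func main() {
	prs := []prSummary{
		{Author: "alice"}, {Author: "bob"}, {Author: "alice"},
		{Author: "dependabot[bot]"}, {Author: "dependabot[bot]"},
	}
	fmt.Println(countBotPRs(prs), countUniqueAuthors(prs)) // prints "2 2"
}
```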
func FetchPRData ¶
func FetchPRData(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)
FetchPRData retrieves pull request information from GitHub and converts it to the format needed for cost calculation.
Uses prx's CacheClient for disk-based caching with automatic cleanup.
The updatedAt parameter enables effective caching. Pass the PR's updatedAt timestamp from GraphQL queries, or time.Now() for fresh data.
Parameters:
- ctx: Context for the API call
- prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
- token: GitHub authentication token
- updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache
Returns:
- cost.PRData with all information needed for cost calculation
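To illustrate why updatedAt matters for caching, here is a small in-memory sketch in which the timestamp is part of the cache key. It does not use prx's CacheClient; prCache, cacheKey, and the two example timestamps are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Example timestamps: the PR as first seen, and after a later update.
var (
	firstUpdate  = time.Date(2025, 1, 1, 9, 0, 0, 0, time.UTC)
	secondUpdate = firstUpdate.Add(time.Hour)
)

// cacheKey pairs the PR URL with its last-update time, so any new
// activity on the PR produces a different key and a cache miss.
type cacheKey struct {
	url       string
	updatedAt int64 // Unix nanoseconds
}

// prCache is a hypothetical in-memory cache that counts real fetches.
type prCache struct {
	entries map[cacheKey]string
	fetches int
}

func newPRCache() *prCache {
	return &prCache{entries: make(map[cacheKey]string)}
}

// get returns the cached value for (url, updatedAt) or calls fetch once.
func (c *prCache) get(url string, updatedAt time.Time, fetch func() string) string {
	k := cacheKey{url: url, updatedAt: updatedAt.UnixNano()}
	if v, ok := c.entries[k]; ok {
		return v
	}
	c.fetches++
	v := fetch()
	c.entries[k] = v
	return v
}

func main() {
	c := newPRCache()
	url := "https://github.com/owner/repo/pull/123"
	fetch := func() string { return "pr-data" }

	c.get(url, firstUpdate, fetch)  // miss: fetches
	c.get(url, firstUpdate, fetch)  // hit: same timestamp
	c.get(url, secondUpdate, fetch) // miss: PR changed
	fmt.Println(c.fetches)          // prints 2
}
```

Passing time.Now() on every call produces a new key each time, which is why it behaves as a cache bypass in this model.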
func FetchPRDataViaTurnserver ¶ added in v0.7.0
func FetchPRDataViaTurnserver(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)
FetchPRDataViaTurnserver retrieves pull request information from the turnserver and converts it to the format needed for cost calculation.
The turnserver aggregates PR data and analysis, and includes full event history when requested. This is more efficient than calling the GitHub API directly for complete PR data.
The updatedAt parameter enables effective caching on the turnserver side. Pass the PR's updatedAt timestamp from GraphQL queries, or time.Now() for fresh data.
Parameters:
- ctx: Context for the API call
- prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
- token: GitHub authentication token
- updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache
Returns:
- cost.PRData with all information needed for cost calculation
func PRDataFromPRX ¶
func PRDataFromPRX(prData *prx.PullRequestData) cost.PRData
PRDataFromPRX converts prx.PullRequestData to cost.PRData. This allows you to use prcost with pre-fetched PR data.
Parameters:
- prData: PullRequestData from prx package
Returns:
- cost.PRData with all information needed for cost calculation
Types ¶
type PRDataWithAnalysis ¶ added in v0.8.0
PRDataWithAnalysis combines PR data with turnserver analysis.
func FetchPRDataWithAnalysisViaTurnserver ¶ added in v0.8.0
func FetchPRDataWithAnalysisViaTurnserver(ctx context.Context, prURL string, token string, updatedAt time.Time) (PRDataWithAnalysis, error)
FetchPRDataWithAnalysisViaTurnserver retrieves pull request information and analysis from the turnserver. This includes both the PR data needed for cost calculation and the workflow analysis (seconds_in_state, workflow_state, etc.).
Parameters:
- ctx: Context for the API call
- prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
- token: GitHub authentication token
- updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache
Returns:
- PRDataWithAnalysis containing both cost.PRData and turn.Analysis
type PRSummary ¶ added in v0.7.0
PRSummary holds minimal information about a PR for sampling and fetching.
func FetchPRsFromOrg ¶ added in v0.7.0
func FetchPRsFromOrg(ctx context.Context, org string, since time.Time, token string, progress ProgressCallback) ([]PRSummary, error)
FetchPRsFromOrg queries the GitHub GraphQL Search API for all PRs across an organization modified since the specified date.
Uses an adaptive multi-query strategy for comprehensive time coverage:
- Query recent activity (updated DESC) - get up to 1000 PRs
- If the limit is hit, query old activity (updated ASC) - get ~500 more
- Check the gap between the oldest "recent" PR and the newest "old" PR
- If the gap is > 1 week, query the early period (created ASC) - get ~250 more
Parameters:
- ctx: Context for the API call
- org: GitHub organization name
- since: Only include PRs updated after this time
- token: GitHub authentication token
- progress: Optional callback for progress updates (can be nil)
Returns:
- Slice of PRSummary for all matching PRs (deduplicated)
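Because the recent, old, and early queries can return overlapping PRs, their results need to be merged without duplicates. The following is a rough sketch of such a merge, assuming a hypothetical URL field as the unique key (PR numbers repeat across repositories in an organization); the package's actual deduplication key is not documented here.

```go
package main

import "fmt"

// prSummary is a hypothetical stand-in for PRSummary; only URL is modeled.
type prSummary struct {
	URL string
}

// mergeDedup concatenates query results in order, keeping the first
// occurrence of each URL and dropping later duplicates.
func mergeDedup(batches ...[]prSummary) []prSummary {
	seen := make(map[string]bool)
	var out []prSummary
	for _, batch := range batches {
		for _, pr := range batch {
			if seen[pr.URL] {
				continue
			}
			seen[pr.URL] = true
			out = append(out, pr)
		}
	}
	return out
}

func main() {
	recent := []prSummary{{URL: "org/a/pull/3"}, {URL: "org/b/pull/3"}}
	old := []prSummary{{URL: "org/a/pull/1"}, {URL: "org/a/pull/3"}}
	merged := mergeDedup(recent, old)
	fmt.Println(len(merged)) // prints 3
}
```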
func FetchPRsFromRepo ¶ added in v0.7.0
func FetchPRsFromRepo(ctx context.Context, owner, repo string, since time.Time, token string, progress ProgressCallback) ([]PRSummary, error)
FetchPRsFromRepo queries the GitHub GraphQL API for all PRs in a repository modified since the specified date.
Uses an adaptive multi-query strategy for comprehensive time coverage:
- Query recent activity (updated DESC) - get up to 1000 PRs
- If the limit is hit, query old activity (updated ASC) - get ~500 more
- Check the gap between the oldest "recent" PR and the newest "old" PR
- If the gap is > 1 week, query the early period (created ASC) - get ~250 more
Parameters:
- ctx: Context for the API call
- owner: GitHub repository owner
- repo: GitHub repository name
- since: Only include PRs updated after this time
- token: GitHub authentication token
- progress: Optional callback for progress updates (can be nil)
Returns:
- Slice of PRSummary for all matching PRs (deduplicated)
func SamplePRs ¶ added in v0.7.0
SamplePRs uses a time-bucket strategy to evenly sample PRs across the time range. This ensures samples are distributed throughout the period rather than clustered. Bot-authored PRs are excluded from sampling.
Parameters:
- prs: List of PRs to sample from
- sampleSize: Desired number of samples
Returns:
- Slice of sampled PRs (may be smaller than sampleSize if insufficient PRs)
Strategy:
- Includes both human and bot-authored PRs
- Divides time range into buckets equal to sampleSize
- Selects most recent PR from each bucket
- If buckets are empty, fills with nearest unused PRs
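The bucket steps above can be sketched roughly as follows. This is a simplified illustration, not the package's implementation: prSummary and its fields are hypothetical, bot handling is left out entirely, and empty buckets are backfilled with the most recent unused PRs rather than the nearest ones.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// prSummary is a hypothetical stand-in for PRSummary.
type prSummary struct {
	Number    int
	UpdatedAt time.Time
}

// day returns midnight UTC on the given day of January 2025 (example data).
func day(d int) time.Time {
	return time.Date(2025, 1, d, 0, 0, 0, 0, time.UTC)
}

// samplePRs splits the time range into sampleSize equal buckets and keeps
// the most recent PR from each one, then backfills any shortfall.
func samplePRs(prs []prSummary, sampleSize int) []prSummary {
	if sampleSize <= 0 || len(prs) == 0 {
		return nil
	}
	sorted := append([]prSummary(nil), prs...) // copy; leave input untouched
	if len(sorted) <= sampleSize {
		return sorted
	}
	// Newest first, so the first PR seen in a bucket is its most recent.
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].UpdatedAt.After(sorted[j].UpdatedAt)
	})
	newest := sorted[0].UpdatedAt
	span := newest.Sub(sorted[len(sorted)-1].UpdatedAt)

	picked := make([]bool, len(sorted))
	bucketFilled := make([]bool, sampleSize)
	out := make([]prSummary, 0, sampleSize)
	for i, pr := range sorted {
		b := 0
		if span > 0 {
			age := newest.Sub(pr.UpdatedAt)
			b = int(float64(age) / float64(span) * float64(sampleSize))
		}
		if b >= sampleSize {
			b = sampleSize - 1 // the oldest PR lands in the last bucket
		}
		if !bucketFilled[b] {
			bucketFilled[b] = true
			picked[i] = true
			out = append(out, pr)
		}
	}
	// Simplified fill step: use the most recent unused PRs.
	for i, pr := range sorted {
		if len(out) >= sampleSize {
			break
		}
		if !picked[i] {
			out = append(out, pr)
		}
	}
	return out
}

func main() {
	var prs []prSummary
	for _, d := range []int{1, 2, 3, 20, 30} {
		prs = append(prs, prSummary{Number: d, UpdatedAt: day(d)})
	}
	for _, pr := range samplePRs(prs, 3) {
		fmt.Println(pr.Number) // prints 30, 20, 3 (one per line)
	}
}
```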
type ProgressCallback ¶ added in v0.8.0
ProgressCallback is called during PR fetching to report progress. Parameters: queryName (e.g., "recent", "old", "early"), currentPage, totalPRsSoFar.
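A callback with these three parameters might look like the sketch below. The exact Go type is not shown in this documentation, so the signature here (a string and two ints) is an assumption, as is the report helper, which shows one way to treat a nil callback as "no progress reporting".

```go
package main

import "fmt"

// progressCallback mirrors the documented parameters; the concrete
// parameter types are assumed for illustration.
type progressCallback func(queryName string, currentPage int, totalPRsSoFar int)

// report is a hypothetical helper that invokes cb only when it is non-nil,
// so callers may pass nil to disable progress updates.
func report(cb progressCallback, queryName string, currentPage, totalPRsSoFar int) {
	if cb != nil {
		cb(queryName, currentPage, totalPRsSoFar)
	}
}

func main() {
	logProgress := func(queryName string, currentPage int, totalPRsSoFar int) {
		fmt.Printf("%s: page %d, %d PRs so far\n", queryName, currentPage, totalPRsSoFar)
	}
	report(logProgress, "recent", 1, 100) // prints "recent: page 1, 100 PRs so far"
	report(nil, "old", 1, 150)            // no output, no panic
}
```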
type SimpleFetcher ¶ added in v0.8.0
SimpleFetcher is a PRFetcher that fetches PR data without caching. It uses either prx or turnserver based on configuration.