How To: Extract Documentation
Purpose
Configure Hyaline to extract documentation using the Hyaline GitHub App
Prerequisite(s)
- Install GitHub App
- Have one or more documentation sources to be extracted (e.g. a git repo or a documentation website)
Steps
1. Create Configuration
The first step is to create a configuration file for the documentation source in the appropriate folder of the forked hyaline-github-app-config repo.
For example, the configuration to extract documentation from a repository should be placed in the repos/ folder, named <repo-name>.yml, and should look something like:
llm:
  provider: ${HYALINE_LLM_PROVIDER}
  model: ${HYALINE_LLM_MODEL}
  key: ${HYALINE_LLM_TOKEN}

github:
  token: ${HYALINE_GITHUB_TOKEN}

extract:
  source:
    id: <documentation source id>
    description: <documentation source description>
  crawler:
    type: git
    options:
      repo: https://github.com/<owner>/<repo>.git
      branch: main
      clone: true
      auth:
        type: http
        options:
          username: git
          password: ${HYALINE_GITHUB_TOKEN}
    include:
      - "**/*.md"
  extractors:
    - type: md
      include:
        - "**/*.md"
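The configuration above relies on several ${...} placeholders. As a rough pre-flight check, the following sketch lists the placeholders found in a configuration file and reports which ones have no value. It assumes the placeholders are filled from environment variables of the same name; how the Hyaline workflows actually resolve them is not covered here, and the helper name missing_variables is hypothetical.

```python
import os
import re

# Matches ${NAME} placeholders such as ${HYALINE_LLM_TOKEN}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_variables(config_text, env=None):
    """Return the placeholder names in config_text that are unset or empty in env.

    Assumption: placeholders map to environment variables of the same name.
    """
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


# A trimmed-down copy of the configuration, used only for illustration.
sample = """
llm:
  key: ${HYALINE_LLM_TOKEN}
github:
  token: ${HYALINE_GITHUB_TOKEN}
"""

# Only the GitHub token is defined here, so the LLM token is reported.
print(missing_variables(sample, env={"HYALINE_GITHUB_TOKEN": "example"}))
# prints ['HYALINE_LLM_TOKEN']
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the text to missing_variables without the env argument, so the current process environment is used.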
Configuration to extract documentation from a documentation site should be placed in the sites/ folder, with the crawler and extractors configured as needed (see the configuration reference for more information).
2. Run Doctor
Run the Doctor workflow in the forked hyaline-github-app-config repo to 1) ensure that the configuration is valid and 2) add the repository or site to the list of available extraction targets. Merge the resulting PR if needed.
3. Run Extract
Run the Extract Repo/Site workflow in the forked hyaline-github-app-config repo to trigger an extraction. Note that you can trigger a merge of this documentation into the current documentation data set by leaving the Trigger Merge Workflow option enabled.
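If you would rather start a workflow from a script than from the GitHub UI, the sketch below builds a request for GitHub's "create a workflow dispatch event" REST endpoint. It assumes the workflow can be dispatched manually; the owner my-org, the workflow file name extract.yml, and the helper name build_dispatch_request are hypothetical placeholders, and any workflow inputs (such as the merge option) must use the input names declared in the workflow file in your fork.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, token, ref="main", inputs=None):
    """Build (but do not send) a workflow dispatch request for the GitHub REST API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = {"ref": ref}
    if inputs:
        # Input names are defined by the workflow file; none are assumed here.
        body["inputs"] = inputs
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical owner, workflow file name, and token; replace with your own values.
request = build_dispatch_request(
    "my-org", "hyaline-github-app-config", "extract.yml", token="example-token"
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch has no side effects:
# urllib.request.urlopen(request)  # GitHub responds with 204 No Content on success
```

The request is built separately from being sent so that the URL and payload can be inspected before anything is triggered.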
Next Steps
Read more about how extraction works or visit the configuration reference.