Reference: Hyaline Data Set

Overview

This documents the database schema of the Hyaline Data Set as generated and stored in SQLite. Note that both current and change data sets share the same schema.

Data Set Schema

Tables

The following tables make up a Hyaline Data Set.

SYSTEM

Systems within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the system, as defined in the config |

Primary Key: ID

SYSTEM_CODE

System code sources within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the system code source, as defined in the config |
| SYSTEM_ID | TEXT | The ID of the system this code source belongs to |
| PATH | TEXT | The path of the code (see below for details) |

Primary Key: (ID, SYSTEM_ID)
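
As a rough illustration of how these two tables fit together, the sketch below creates them in an in-memory SQLite database and reads back the code sources of one system. The DDL is an assumption reconstructed from the column lists and primary keys above; the statements Hyaline actually runs are not shown in this reference, and the sample IDs (my-app, backend) and helper names are hypothetical.

```python
import sqlite3

# Assumed DDL: column names, types, and keys follow the tables above,
# but the exact statements Hyaline uses are not part of this reference.
SCHEMA = """
CREATE TABLE SYSTEM (
    ID TEXT,
    PRIMARY KEY (ID)
);
CREATE TABLE SYSTEM_CODE (
    ID TEXT,
    SYSTEM_ID TEXT,
    PATH TEXT,
    PRIMARY KEY (ID, SYSTEM_ID)
);
"""


def create_example_data_set() -> sqlite3.Connection:
    """Build an in-memory database with both tables and hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO SYSTEM (ID) VALUES (?)", ("my-app",))
    conn.execute(
        "INSERT INTO SYSTEM_CODE (ID, SYSTEM_ID, PATH) VALUES (?, ?, ?)",
        ("backend", "my-app", "./"),
    )
    conn.commit()
    return conn


def code_sources_for(conn: sqlite3.Connection, system_id: str) -> list:
    """Return (ID, PATH) for every code source of the given system."""
    rows = conn.execute(
        "SELECT ID, PATH FROM SYSTEM_CODE WHERE SYSTEM_ID = ? ORDER BY ID",
        (system_id,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(code_sources_for(create_example_data_set(), "my-app"))
```

Because the key is (ID, SYSTEM_ID), the same code source ID can appear under different systems, but not twice within one system.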

PATH

The path, or location, of the code being extracted as defined in the config. The value of the path depends on the type of extractor. Possible values are detailed below:

| Extractor | Value |
| --- | --- |
| fs | The value of path in the extractor options |
| git | The value of path in the extractor options if set, otherwise the value of repo in the extractor options |

SYSTEM_FILE

Files belonging to system code sources within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the file. This will be the path to the file relative to the SYSTEM_CODE.PATH |
| CODE_ID | TEXT | The ID of the system code source this file belongs to |
| SYSTEM_ID | TEXT | The ID of the system this file belongs to |
| ACTION | TEXT | Only set when extracting changes. The git action associated with this file in the case that a change is extracted. See Enums > Action below for possible values |
| ORIGINAL_ID | TEXT | Only set when extracting changes. The original file ID of this file if it was renamed |
| RAW_DATA | TEXT | The raw contents of this file |

Primary Key: (ID, CODE_ID, SYSTEM_ID)
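
The following sketch shows one way a change data set might be queried for files that carry an action, including the original ID of renamed files. It assumes the table layout above; whether an unset ACTION is stored as an empty string or as NULL is not stated here, so the query treats both as "no action". The function names and sample rows are hypothetical.

```python
import sqlite3


def changed_files(conn: sqlite3.Connection, system_id: str) -> list:
    """Return (ID, ACTION, ORIGINAL_ID) for files with a non-blank ACTION."""
    rows = conn.execute(
        """
        SELECT ID, ACTION, ORIGINAL_ID
        FROM SYSTEM_FILE
        WHERE SYSTEM_ID = ? AND COALESCE(ACTION, '') <> ''
        ORDER BY ID
        """,
        (system_id,),
    )
    return rows.fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create SYSTEM_FILE in memory with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE SYSTEM_FILE (
            ID TEXT, CODE_ID TEXT, SYSTEM_ID TEXT,
            ACTION TEXT, ORIGINAL_ID TEXT, RAW_DATA TEXT,
            PRIMARY KEY (ID, CODE_ID, SYSTEM_ID)
        )
        """
    )
    conn.executemany(
        "INSERT INTO SYSTEM_FILE VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("src/main.go", "backend", "my-app", "Modify", "", "package main"),
            ("src/util.go", "backend", "my-app", "Rename", "src/helpers.go", "package main"),
            ("README.md", "backend", "my-app", "", "", "# readme"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in changed_files(demo_connection(), "my-app"):
        print(row)
```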

SYSTEM_DOCUMENTATION

System documentation sources within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the system documentation source, as defined in the config |
| SYSTEM_ID | TEXT | The ID of the system this documentation source belongs to |
| TYPE | TEXT | The type of documentation. For possible values please see Enums > Documentation Type below |
| PATH | TEXT | The path of the documentation (see below for details) |

Primary Key: (ID, SYSTEM_ID)

PATH

The path, or location, of the documentation being extracted as defined in the config. The value of the path depends on the type of extractor. Possible values are detailed below:

| Extractor | Value |
| --- | --- |
| fs | The value of path in the extractor options |
| git | The value of path in the extractor options if set, otherwise the value of repo in the extractor options |
| http | The parsed value of the baseUrl in the extractor options ({scheme}://{host}). Note that host includes port if set |
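
The rules for deriving PATH can be sketched as a small function. This is an illustration rather than Hyaline's own code: the options dictionary and the function name are assumptions, and only the option names path, repo, and baseUrl come from this reference. The sketch uses the URL's network location as the host, which includes the port when one is set.

```python
from urllib.parse import urlsplit


def resolve_source_path(extractor_type: str, options: dict) -> str:
    """Derive the PATH value for a source from its extractor options."""
    if extractor_type == "fs":
        return options["path"]
    if extractor_type == "git":
        # Prefer path when it is set, otherwise fall back to repo.
        return options.get("path") or options["repo"]
    if extractor_type == "http":
        parts = urlsplit(options["baseUrl"])
        # netloc carries the port when one is present in the URL.
        return f"{parts.scheme}://{parts.netloc}"
    raise ValueError(f"unknown extractor type: {extractor_type!r}")


if __name__ == "__main__":
    print(resolve_source_path("fs", {"path": "./docs"}))
    print(resolve_source_path("git", {"repo": "git@example.com:org/repo.git"}))
    print(resolve_source_path("http", {"baseUrl": "https://docs.example.com:8443/guide"}))
```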

SYSTEM_DOCUMENT

Documents belonging to system documentation sources within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the document. This will be the path to the document relative to the SYSTEM_DOCUMENTATION.PATH |
| DOCUMENTATION_ID | TEXT | The ID of the system documentation source this document belongs to |
| SYSTEM_ID | TEXT | The ID of the system this document belongs to |
| TYPE | TEXT | The type of document. For possible values please see Enums > Documentation Type below |
| ACTION | TEXT | Only set when extracting changes. The git action associated with this document in the case that a change is extracted. See Enums > Action below for possible values |
| ORIGINAL_ID | TEXT | Only set when extracting changes. The original document ID of this document if it was renamed |
| RAW_DATA | TEXT | The raw contents of this document |
| EXTRACTED_DATA | TEXT | The data extracted from this document in markdown format. See Extract Current for how markdown is extracted from documents |

Primary Key: (ID, DOCUMENTATION_ID, SYSTEM_ID)

SYSTEM_SECTION

Document sections extracted from documents belonging to system documentation sources within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the section. See below for how it is calculated |
| DOCUMENT_ID | TEXT | The ID of the document this section belongs to |
| DOCUMENTATION_ID | TEXT | The ID of the system documentation source this section belongs to |
| SYSTEM_ID | TEXT | The ID of the system this section belongs to |
| NAME | TEXT | The name of this section |
| PARENT_ID | TEXT | The ID of the parent section (blank if no parent) |
| PEER_ORDER | TEXT (parsed as int) | The order of this section amongst its peers (0 based) |
| EXTRACTED_DATA | TEXT | The data extracted from SYSTEM_DOCUMENT.EXTRACTED_DATA for this section. Note that this will contain the section contents including any child sections |

Primary Key: (ID, DOCUMENT_ID, DOCUMENTATION_ID, SYSTEM_ID)
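
Since PEER_ORDER is stored as TEXT but parsed as an integer, a plain ORDER BY on the column would sort "10" before "2". The sketch below casts the value before sorting so that child sections come back in document order. The function names and sample rows are hypothetical, and the parent_id default assumes that a missing parent is stored as an empty string.

```python
import sqlite3


def child_sections(conn, system_id, documentation_id, document_id, parent_id=""):
    """Return (ID, NAME) of the sections under parent_id in peer order."""
    rows = conn.execute(
        """
        SELECT ID, NAME
        FROM SYSTEM_SECTION
        WHERE SYSTEM_ID = ? AND DOCUMENTATION_ID = ?
          AND DOCUMENT_ID = ? AND PARENT_ID = ?
        ORDER BY CAST(PEER_ORDER AS INTEGER)
        """,
        (system_id, documentation_id, document_id, parent_id),
    )
    return rows.fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create SYSTEM_SECTION in memory with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE SYSTEM_SECTION (
            ID TEXT, DOCUMENT_ID TEXT, DOCUMENTATION_ID TEXT, SYSTEM_ID TEXT,
            NAME TEXT, PARENT_ID TEXT, PEER_ORDER TEXT, EXTRACTED_DATA TEXT,
            PRIMARY KEY (ID, DOCUMENT_ID, DOCUMENTATION_ID, SYSTEM_ID)
        )
        """
    )
    rows = [
        (f"Guide#Step {n}", "guide.md", "docs", "my-app", f"Step {n}", "Guide", str(n), "")
        for n in (10, 2, 0)
    ]
    conn.executemany("INSERT INTO SYSTEM_SECTION VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for row in child_sections(demo_connection(), "my-app", "docs", "guide.md", "Guide"):
        print(row)
```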

ID

The ID is constructed from the section’s title, prefixed by the titles of its parent sections up to the root of the document, with the titles separated by #. For example, the sub-section Foo (parent Bar) has an ID of Bar#Foo. Any # symbols in a section title are stripped, leading and trailing whitespace is removed from each title, and internal whitespace is preserved (for example, My Section#My Subsection is valid).
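
The ID rules can be expressed as a short function. This is a minimal sketch of the rules as described here, not Hyaline's implementation; the function name and the root-first list of titles are assumptions made for illustration.

```python
def build_section_id(titles: list) -> str:
    """Build a section ID from titles ordered from the root section down."""
    cleaned = []
    for title in titles:
        # Strip every # from the title, then trim the surrounding whitespace.
        cleaned.append(title.replace("#", "").strip())
    return "#".join(cleaned)


if __name__ == "__main__":
    print(build_section_id(["Bar", "Foo"]))
    print(build_section_id(["  My Section ", "My #Subsection"]))
```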

SYSTEM_CHANGE

System changes, such as pull requests, for systems within Hyaline.

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the change. See below for how it is calculated |
| SYSTEM_ID | TEXT | The ID of the system this change belongs to |
| TYPE | TEXT | The type of change. See Enums > Change Type below for possible values |
| TITLE | TEXT | The title of the change |
| BODY | TEXT | The contents of the change (in markdown) |

Primary Key: (ID, SYSTEM_ID)

ID

The Change ID is constructed based on the Type of change. The format for each type is detailed below:

| Type | Format |
| --- | --- |
| GITHUB_PULL_REQUEST | OWNER/REPO/ID |
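
For illustration, the OWNER/REPO/ID format could be built and split as shown below. The function names are hypothetical, and treating the final component as the pull request number is an assumption; this reference only names it ID.

```python
def build_change_id(owner: str, repo: str, pr_id: str) -> str:
    """Join the three components into an OWNER/REPO/ID change ID."""
    return f"{owner}/{repo}/{pr_id}"


def parse_change_id(change_id: str) -> tuple:
    """Split an OWNER/REPO/ID change ID back into its three components."""
    parts = change_id.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected OWNER/REPO/ID, got {change_id!r}")
    return parts[0], parts[1], parts[2]


if __name__ == "__main__":
    change_id = build_change_id("example-org", "example-repo", "42")
    print(change_id)
    print(parse_change_id(change_id))
```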

SYSTEM_TASK

| Column | Type | Description |
| --- | --- | --- |
| ID | TEXT | The ID of the task. See below for how it is calculated |
| SYSTEM_ID | TEXT | The ID of the system this task belongs to |
| TYPE | TEXT | The type of task. See Enums > Task Type below for possible values |
| TITLE | TEXT | The title of the task |
| BODY | TEXT | The contents of the task (in markdown) |

Primary Key: (ID, SYSTEM_ID)

Enums

The following enums exist and are used in Hyaline data sets.

Action

| Value | Description |
| --- | --- |
| (blank) | No action |
| Insert | A file or document was inserted |
| Modify | A file or document was modified |
| Rename | A file or document was renamed (may also have modifications) |
| Delete | A file or document was deleted |
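
When reading the ACTION column in application code, the values above map naturally onto an enumeration. The sketch below is one possible mapping; the class and function names are hypothetical, and treating NULL the same as a blank value is an assumption.

```python
from enum import Enum
from typing import Optional


class Action(str, Enum):
    """Values of the ACTION column; the blank value means no action."""

    NONE = ""
    INSERT = "Insert"
    MODIFY = "Modify"
    RENAME = "Rename"
    DELETE = "Delete"


def parse_action(value: Optional[str]) -> Action:
    """Convert a stored ACTION value to an Action, treating None as blank."""
    # Raises ValueError for any value that is not listed in the enum.
    return Action(value or "")


if __name__ == "__main__":
    print(parse_action("Rename"))
    print(parse_action(""))
```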

Documentation Type

| Value | Description |
| --- | --- |
| md | Markdown |
| html | HTML |

Change Type

| Value | Description |
| --- | --- |
| GITHUB_PULL_REQUEST | A GitHub Pull Request |

Task Type

| Value | Description |
| --- | --- |
| GITHUB_ISSUE | A GitHub Issue |