Use this plugin to import repository documents into CNB's knowledge base to support search or LLM RAG Q&A.
Currently supports Markdown, mdx, pdf, docx, txt files. File types are determined solely by their extensions.
As shown in the diagram above, the usage process consists of two steps:
1. Use this knowledge base plugin to import repository documents into CNB's knowledge base. The plugin runs in CNB's cloud-native build environment, automatically processes the documents, and constructs the knowledge base. Once built, the knowledge base can be used by downstream LLM applications.
2. After the knowledge base is built, use CNB's Open API for retrieval and combine the retrieved content with an LLM to generate responses.
A typical RAG application workflow is as follows:
1. The user asks a question.
2. Question understanding.
3. Call knowledge base retrieval, using CNB's Open API as mentioned above to fetch relevant documents.
4. Construct a combined prompt: question + knowledge context. For example, the combined prompt usually looks like:
   User: {user_question}
   Knowledge Base: {knowledge_content}
   Please answer the user's question based on the above knowledge base.
5. Send the combined prompt to the LLM to generate an answer and return it to the user.
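Steps 3 to 5 of the workflow can be sketched in Python as follows. This is a minimal illustration, not part of the plugin: `build_prompt` and `answer` are hypothetical names, and `retrieve` and `generate` are placeholder callables that you would back with the CNB query API and your own LLM client. Step 2 (question understanding) is omitted because it depends on the application.

```python
from typing import Callable, List


def build_prompt(user_question: str, chunks: List[str]) -> str:
    """Combine the question and retrieved chunks using the prompt template above."""
    knowledge_content = "\n\n".join(chunks)
    return (
        f"User: {user_question}\n"
        f"Knowledge Base: {knowledge_content}\n"
        "Please answer the user's question based on the above knowledge base."
    )


def answer(
    user_question: str,
    retrieve: Callable[[str], List[str]],  # hypothetical: wraps the knowledge base query
    generate: Callable[[str], str],        # hypothetical: wraps your LLM client
) -> str:
    """Retrieve context, build the combined prompt, and ask the LLM."""
    chunks = retrieve(user_question)
    prompt = build_prompt(user_question, chunks)
    return generate(prompt)
```

Passing the retriever and the LLM client as callables keeps the sketch independent of any particular SDK and makes it easy to test with stand-ins.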
cnbcool/knowledge-base
- include: Specifies files to include, using glob pattern matching. Default is * (include all files). Supports array or comma-separated values.
- exclude: Specifies files to exclude, using glob pattern matching. By default, no files are excluded. Supports array or comma-separated values.
- force_rebuild: Whether to delete and recreate the knowledge base. Default is false. When set to true, the knowledge base is deleted and recreated.
- chunk_size: Specifies the text chunk size. Default is 1500.
- chunk_overlap: Specifies the number of overlapping tokens between adjacent chunks. Default is 0.
- ignore_process_failures: Whether to ignore document processing failures. Default is false. When set to true, the knowledge base is updated even if some files fail to process.
- issue_sync_enabled: Whether to enable Issue synchronization. Default is false. When enabled, the plugin automatically pulls repository Issue data and adds it to the knowledge base.
- issue_state: Only sync Issues with the specified state. By default, all Issues are included. Possible values: open|closed.
- issue_labels: Only sync Issues with the specified labels. By default, all Issues are included. Supports array or comma-separated values.
- embedding_model: Embedding model. Default is hunyuan. Currently only hunyuan is supported.

Note: exclude takes precedence over include. Files excluded by exclude will not be included even if matched by include.
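The precedence rule can be illustrated with a short Python sketch. It is an approximation under stated assumptions: `select_files` is a hypothetical helper, and it uses Python's `fnmatchcase`, whose pattern semantics may differ from the glob implementation the plugin actually uses (for instance, `*` in `fnmatch` also matches `/`).

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_files(paths: Iterable[str], include: List[str], exclude: List[str]) -> List[str]:
    """Keep paths matched by include unless an exclude pattern also matches."""
    selected = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude)
        # exclude wins: a path matched by both lists is dropped
        if included and not excluded:
            selected.append(path)
    return selected


files = ["docs/intro.md", "docs/draft.md", "docs/notes.txt", "src/main.go"]
print(select_files(files, include=["docs/*.md", "docs/*.txt"], exclude=["docs/draft.*"]))
# prints ['docs/intro.md', 'docs/notes.txt']
```

Here `docs/draft.md` matches the include pattern `docs/*.md` but is still dropped because it also matches the exclude pattern.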
# Import Markdown and text files under docs/
main:
  push:
    - stages:
        - name: build knowledge base
          image: cnbcool/knowledge-base
          settings:
            include:
              - docs/*.md
              - docs/*.txt

# Import documents and also sync open Issues with the given labels
main:
  push:
    - stages:
        - name: build knowledge base
          image: cnbcool/knowledge-base
          settings:
            include:
              - docs/*.md
              - docs/*.txt
            issue_sync_enabled: true
            issue_labels:
              - bug
              - feature
            issue_state: "open"
            issue_priority: "P0,P1"
This API is used to query the knowledge base, returning relevant information based on provided query keywords.
Before starting, please read the CNB Open API Tutorial. The access token requires the following permission:
repo-code:r (read repository code)
https://api.cnb.cool/{slug}/-/knowledge/base/query
Note: {slug} should be replaced with the repository slug. For example, the CNB official documentation knowledge base repository address is https://cnb.cool/cnb/docs, so {slug} is cnb/docs.
The request body should be in JSON format with the following fields:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Keywords or question to query |
| top_k | number | No | Maximum number of results to return. Default is 5 |
| score_threshold | number | No | Relevance score threshold. Default is 0 |
{
  "query": "Cloud-native development configuration for custom buttons"
}
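As an illustration of the optional fields, the following Python sketch builds the request body and leaves out `top_k` and `score_threshold` when they are not set, so the defaults from the table apply. `build_query_body` is a hypothetical helper, not part of any CNB SDK.

```python
import json
from typing import Optional


def build_query_body(
    query: str,
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> str:
    """Serialize the request body, omitting optional fields that were not set."""
    if not query:
        raise ValueError("query is required")
    body = {"query": query}
    if top_k is not None:
        body["top_k"] = top_k
    if score_threshold is not None:
        body["score_threshold"] = score_threshold
    return json.dumps(body)
```

For example, `build_query_body("custom buttons", top_k=3, score_threshold=0.5)` produces a body that asks for at most three results with a relevance score threshold of 0.5.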
The response is in JSON format, containing an array of results. Each result includes the following fields:
| Field | Type | Description |
|---|---|---|
| score | number | Relevance score, range 0-1 |
| chunk | string | Matched knowledge base text |
| metadata | object | Content metadata |

The metadata object contains the following fields:

| Field | Type | Description |
|---|---|---|
| hash | string | Unique content hash |
| name | string | Document name |
| path | string | Document path |
| position | number | Position in original document |
| score | number | Relevance score |
| type | string | Content type, e.g., "code", "issue" |
| url | string | Content URL |
[
  {
    "score": 0.8671732,
    "chunk": "This cloud-native remote development solution is based on Docker...",
    "metadata": {
      "hash": "15f7a1fc4420cbe9d81a946c9fc88814",
      "name": "quick-start",
      "path": "vscode/quick-start.md",
      "position": 0,
      "score": 0.8671732
    }
  }
]
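A response of this shape could be consumed along the following lines. This is a sketch only: the `Result` class and `parse_results` function are hypothetical names, and the code reads just the `score`, `chunk`, and `metadata.path` fields described above.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    score: float
    chunk: str
    path: str


def parse_results(response_text: str, min_score: float = 0.0) -> List[Result]:
    """Parse the JSON array and return results sorted by descending score."""
    results = []
    for item in json.loads(response_text):
        metadata = item.get("metadata", {})
        # Client-side filter; min_score is an assumption of this sketch
        if item["score"] >= min_score:
            results.append(Result(item["score"], item["chunk"], metadata.get("path", "")))
    return sorted(results, key=lambda result: result.score, reverse=True)
```

The `chunk` values of the returned results can then be joined to form the knowledge context of a RAG prompt.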
Note: {slug} is the slug of the repository where the knowledge base plugin runs. For example, the slug of the CNB official documentation knowledge base is cnb/docs.
curl -X "POST" "https://api.cnb.cool/{slug}/-/knowledge/base/query" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${token}" \
  -d '{
    "query": "Cloud-native development configuration for custom buttons"
  }'
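The same request can be sketched in Python with only the standard library. The function names `build_request` and `query_knowledge_base` and the `timeout` parameter are assumptions of this example; the URL, headers, and body mirror the curl command above.

```python
import json
import urllib.request

API_BASE = "https://api.cnb.cool"


def build_request(slug: str, token: str, query: str) -> urllib.request.Request:
    """Build the POST request for the knowledge base query endpoint."""
    url = f"{API_BASE}/{slug}/-/knowledge/base/query"
    data = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def query_knowledge_base(slug: str, token: str, query: str, timeout: float = 10.0) -> list:
    """Send the query and return the decoded JSON array of results."""
    request = build_request(slug, token, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `query_knowledge_base("cnb/docs", token, "custom buttons")` would query the CNB official documentation knowledge base, given a token with the repo-code:r permission.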
- The chunk field in the response contains the matched knowledge base content fragment in Markdown format.
- Higher score values indicate better matches.
- The slug in the URL is the slug of the repository where the knowledge base plugin runs.