Knowledge Base Plugin

Use this plugin to import repository documents into CNB's knowledge base to support search or LLM RAG Q&A.

The plugin currently supports Markdown, mdx, pdf, docx, and txt files. The file type is determined solely by the file extension.

Knowledge Preparation

Understanding the RAG Application Flowchart

Plugin Usage

As shown in the diagram above, the usage process consists of two steps:

1. 📚 Data Preparation Stage

Use this knowledge base plugin to import repository documents into CNB's knowledge base. The plugin runs in CNB's cloud-native build environment, automatically processes documents, and constructs the knowledge base. Once the knowledge base is built, it can be used by downstream LLM applications.

2. 💻 LLM Application Development

After the knowledge base is built, use CNB's Open API to retrieve relevant content and combine it with an LLM to generate responses.

A typical RAG application workflow is as follows:

1. User asks a question

2. Question understanding

3. Knowledge base retrieval: call CNB's Open API, as described above, to fetch relevant documents

4. Construct the combined prompt: question + knowledge context. For example, the combined prompt usually looks like:


User: {user_question}

Knowledge Base:
{knowledge_content}

Please answer the user's question based on the above knowledge base.

5. Send the combined prompt to the LLM to generate an answer and return it to the user.
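
Step 4 can be sketched in Python as follows. This is a minimal illustration, not part of the plugin: the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and joining chunks with a blank line is an assumption about how the knowledge context might be assembled.

```python
# Template taken from the combined prompt shown in step 4.
PROMPT_TEMPLATE = (
    "User: {user_question}\n"
    "\n"
    "Knowledge Base:\n"
    "{knowledge_content}\n"
    "\n"
    "Please answer the user's question based on the above knowledge base."
)


def build_prompt(user_question: str, chunks: list[str]) -> str:
    """Fill the template with the question and the retrieved chunks.

    Hypothetical helper: empty chunks are skipped and the remaining ones
    are separated by a blank line (an assumption, not plugin behavior).
    """
    knowledge_content = "\n\n".join(
        chunk.strip() for chunk in chunks if chunk.strip()
    )
    return PROMPT_TEMPLATE.format(
        user_question=user_question.strip(),
        knowledge_content=knowledge_content,
    )
```

The returned string is what would be sent to the LLM in step 5; the `chunks` argument would typically hold the `chunk` values returned by the retrieval call in step 3.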

Plugin Details

Plugin Image Name

cnbcool/knowledge-base

Parameter Description

Document Processing Parameters

include: Specifies files to include, using glob pattern matching. Default is * (include all files). Supports array or comma-separated values.
exclude: Specifies files to exclude, using glob pattern matching. By default, no files are excluded. Supports array or comma-separated values.
force_rebuild: Whether to delete and recreate the knowledge base. Default is false. When set to true, the knowledge base will be deleted and recreated.
chunk_size: Specifies text chunk size. Default is 1500.
chunk_overlap: Specifies the number of overlapping tokens between adjacent chunks. Default is 0.
ignore_process_failures: Whether to ignore document processing failures. Default is false. When set to true, the knowledge base will be updated even if some files fail to process.
issue_sync_enabled: Whether to enable Issue synchronization. Default is false. When enabled, it automatically pulls repository Issue data and adds it to the knowledge base.
issue_state: Only sync Issues with specified state. By default includes all Issues. Possible values: open|closed.
issue_labels: Only sync Issues with specified labels. By default includes all Issues. Supports array or comma-separated values.
embedding_model: Embedding model. Default is hunyuan. Currently only supports hunyuan.
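
To make the interaction between chunk_size and chunk_overlap concrete, here is a small sliding-window sketch. It is only an illustration: the plugin's actual splitter is not described here, and the sketch assumes both values are counted in tokens over an already tokenized document. The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(
    tokens: list[str], chunk_size: int = 1500, chunk_overlap: int = 0
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens.

    Illustrative only; assumes sizes are measured in tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks
```

With the default chunk_overlap of 0 the windows are disjoint; a larger overlap repeats the tail of each chunk at the head of the next, which increases the number of chunks.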

Note: exclude takes precedence over include. Files excluded by exclude will not be included even if matched by include.
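
The precedence rule above can be sketched as follows. This is an approximation using Python's standard `fnmatch` module, whose `*` also matches path separators; the plugin's own glob matching may differ. The helper names `as_patterns` and `select_files` are hypothetical.

```python
from fnmatch import fnmatchcase


def as_patterns(value) -> list[str]:
    """Accept either a list of patterns or a comma-separated string."""
    if isinstance(value, str):
        return [part.strip() for part in value.split(",") if part.strip()]
    return list(value)


def select_files(paths, include="*", exclude=()) -> list[str]:
    """Return the paths matched by include and not matched by exclude."""
    include_patterns = as_patterns(include)
    exclude_patterns = as_patterns(exclude)
    selected = []
    for path in paths:
        if any(fnmatchcase(path, pattern) for pattern in exclude_patterns):
            continue  # exclude wins, even if an include pattern also matches
        if any(fnmatchcase(path, pattern) for pattern in include_patterns):
            selected.append(path)
    return selected
```

For instance, with include set to `docs/*.md` and exclude set to `docs/draft*`, a file named `docs/draft.md` is skipped even though it matches the include pattern.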

Using the Plugin in CNB

Basic Usage Example


main:
  push:
    - stages:
        - name: build knowledge base
          image: cnbcool/knowledge-base
          settings:
            include: 
              - docs/*.md
              - docs/*.txt

Example with Document Processing and Issue Sync Enabled


main:
  push:
    - stages:
        - name: build knowledge base
          image: cnbcool/knowledge-base
          settings:
            include:
              - docs/*.md
              - docs/*.txt
            issue_sync_enabled: true
            issue_labels:
              - bug
              - feature
            issue_state: "open"
            issue_priority: "P0,P1"

Using CNB's Open API for Knowledge Base Retrieval

This API is used to query the knowledge base, returning relevant information based on provided query keywords.

Before starting, please read the CNB Open API Tutorial. The access token requires the permission repo-code:r (read repository code).

API Information

URL: https://api.cnb.cool/{slug}/-/knowledge/base/query
Method: POST
Content Type: application/json

Note: Replace {slug} with the repository slug. For example, the repository address of the CNB official documentation knowledge base is https://cnb.cool/cnb/docs, so {slug} is cnb/docs.

Request Parameters

The request body should be in JSON format with the following fields:

Parameter	Type	Required	Description
query	string	Yes	Keywords or question to query
top_k	number	No	Maximum number of results to return. Default is 5
score_threshold	number	No	Relevance score threshold. Default is 0

Request Example


{
    "query": "Cloud-native development configuration for custom buttons"
}
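
A request body that also sets the optional fields could be assembled like this. It is a sketch under assumptions: the helper name `build_query_payload` is hypothetical, and the range checks (top_k at least 1, score_threshold between 0 and 1) are this example's own guesses based on the documented score range, not rules confirmed by the API.

```python
import json


def build_query_payload(query, top_k=None, score_threshold=None) -> dict:
    """Build the JSON body; optional fields are omitted when not given."""
    if not query or not query.strip():
        raise ValueError("query is required")
    payload = {"query": query}
    if top_k is not None:
        if top_k < 1:  # assumed lower bound, not confirmed by the API
            raise ValueError("top_k must be at least 1")
        payload["top_k"] = top_k
    if score_threshold is not None:
        if not 0 <= score_threshold <= 1:  # assumed from the 0-1 score range
            raise ValueError("score_threshold must be between 0 and 1")
        payload["score_threshold"] = score_threshold
    return payload


# Serialize the payload for use as the POST body.
body = json.dumps(build_query_payload("custom buttons", top_k=3, score_threshold=0.5))
```

Leaving the optional fields out lets the API apply its documented defaults (5 results, threshold 0).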

Response Content

The response is in JSON format, containing an array of results. Each result includes the following fields:

Field	Type	Description
score	number	Relevance score, range 0-1
chunk	string	Matched knowledge base text
metadata	object	Content metadata

Metadata Field Details

Field	Type	Description
hash	string	Unique content hash
name	string	Document name
path	string	Document path
position	number	Position in original document
score	number	Relevance score
type	string	Content type, e.g., "code", "issue"
url	string	Content URL

Response Example


[
    {
        "score": 0.8671732,
        "chunk": "This cloud-native remote development solution is based on Docker...",
        "metadata": {
            "hash": "15f7a1fc4420cbe9d81a946c9fc88814",
            "name": "quick-start",
            "path": "vscode/quick-start.md",
            "position": 0,
            "score": 0.8671732
        }
    }
]
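
A response in this shape can be turned into typed records and filtered on the client side, for example as below. The `Hit` class, the `parse_results` function, and the `min_score` parameter are hypothetical names for this sketch; only the field names `score`, `chunk`, `metadata`, and `path` come from the response description.

```python
import json
from dataclasses import dataclass


@dataclass
class Hit:
    score: float
    chunk: str
    path: str


def parse_results(body: str, min_score: float = 0.0) -> list[Hit]:
    """Parse the JSON array and return hits sorted by descending score."""
    results = json.loads(body)
    hits = [
        Hit(
            score=float(item["score"]),
            chunk=item["chunk"],
            # metadata.path may be absent for some content types (assumption)
            path=item.get("metadata", {}).get("path", ""),
        )
        for item in results
        if float(item["score"]) >= min_score
    ]
    return sorted(hits, key=lambda hit: hit.score, reverse=True)
```

The `chunk` values of the returned hits are what a RAG application would place into the knowledge context of its prompt.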

Usage Example

cURL Request Example

Note: {slug} is the slug of the repository where the knowledge base plugin runs. For example, the slug of the CNB official documentation knowledge base is cnb/docs.


curl -X "POST" "https://api.cnb.cool/{slug}/-/knowledge/base/query" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${token}" \
  -d '{
    "query": "Cloud-native development configuration for custom buttons"
}'
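
The same request can be issued from Python with only the standard library. This is a sketch that mirrors the cURL call above; the function names `build_query_request` and `query_knowledge_base` and the 30-second timeout are assumptions of this example.

```python
import json
import urllib.request


def build_query_request(slug: str, token: str, query: str) -> urllib.request.Request:
    """Build the POST request with the same URL, headers, and body as the cURL example."""
    return urllib.request.Request(
        url=f"https://api.cnb.cool/{slug}/-/knowledge/base/query",
        data=json.dumps({"query": query}).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def query_knowledge_base(slug: str, token: str, query: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON array of results."""
    request = build_query_request(slug, token, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_knowledge_base("cnb/docs", token, "...")` with a valid access token would query the CNB official documentation knowledge base; an HTTP error status raises `urllib.error.HTTPError`.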

Notes

The chunk field in the response contains the matched knowledge base content fragment in Markdown format.
Higher score values indicate better matches.
The slug in the URL is the repository slug where the knowledge base plugin runs.
