Knowledge Base Plugin

Use this plugin to import repository documents into CNB's knowledge base to support search or LLM RAG Q&A.

The plugin currently supports Markdown, MDX, PDF, DOCX, and TXT files. The file type is determined solely by the file extension.

Knowledge Preparation

Understanding the RAG Application Flowchart

Plugin Usage

As shown in the diagram above, the usage process consists of two steps:

1. 📚 Data Preparation Stage

Use this knowledge base plugin to import repository documents into CNB's knowledge base. The plugin runs in CNB's cloud-native build environment, automatically processes documents, and constructs the knowledge base. Once the knowledge base is built, it can be used by downstream LLM applications.

2. 💻 LLM Application Development

After the knowledge base is built, use CNB's Open API to retrieve relevant content and combine it with an LLM to generate responses.

A typical RAG application workflow is as follows:

1. User asks a question

2. Question understanding

3. Call knowledge base retrieval, using CNB's Open API as mentioned above to fetch relevant documents

4. Construct the combined prompt: question + knowledge context. For example, the combined prompt usually looks like:

    User: {user_question}
    Knowledge Base: {knowledge_content}
    Please answer the user's question based on the above knowledge base.

5. Send the combined prompt to the LLM to generate an answer and return it to the user.
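
Step 4 of the workflow above can be sketched in a few lines of Python. This is only an illustration: the function name `build_prompt` and the choice to join chunks with blank lines are assumptions, not part of the plugin or the Open API.

```python
def build_prompt(user_question: str, chunks: list[str]) -> str:
    """Combine the user's question with retrieved chunks into one prompt.

    Hypothetical helper: it follows the template shown in step 4, and
    joining the chunks with blank lines is an assumption of this sketch.
    """
    knowledge_content = "\n\n".join(chunks)
    return (
        f"User: {user_question}\n"
        f"Knowledge Base: {knowledge_content}\n"
        "Please answer the user's question based on the above knowledge base."
    )


if __name__ == "__main__":
    print(build_prompt("How do I configure custom buttons?",
                       ["First retrieved chunk.", "Second retrieved chunk."]))
```

The resulting string is what would be sent to the LLM in step 5.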

Plugin Details

Plugin Image Name

cnbcool/knowledge-base

Parameter Description

Document Processing Parameters

  • include: Specifies files to include, using glob pattern matching. Default is * (include all files). Supports array or comma-separated values.
  • exclude: Specifies files to exclude, using glob pattern matching. By default, no files are excluded. Supports array or comma-separated values.
  • force_rebuild: Whether to delete and recreate the knowledge base. Default is false. When set to true, the existing knowledge base is deleted and rebuilt from scratch.
  • chunk_size: Specifies text chunk size. Default is 1500.
  • chunk_overlap: Specifies the number of overlapping tokens between adjacent chunks. Default is 0.
  • ignore_process_failures: Whether to ignore document processing failures. Default is false. When set to true, the knowledge base will be updated even if some files fail to process.
  • issue_sync_enabled: Whether to enable Issue synchronization. Default is false. When enabled, it automatically pulls repository Issue data and adds it to the knowledge base.
  • issue_state: Only sync Issues with specified state. By default includes all Issues. Possible values: open|closed.
  • issue_labels: Only sync Issues with specified labels. By default includes all Issues. Supports array or comma-separated values.
  • embedding_model: Embedding model. Default is hunyuan. Currently only supports hunyuan.
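
To make the relationship between chunk_size and chunk_overlap concrete, here is a minimal sliding-window sketch. It is not the plugin's implementation: the source does not say how the plugin tokenizes text or which unit chunk_size uses, so the function below (a hypothetical `split_into_chunks`) simply operates on a pre-tokenized list.

```python
def split_into_chunks(tokens, chunk_size=1500, chunk_overlap=0):
    """Split a token list into windows of chunk_size that share
    chunk_overlap tokens with the previous window.

    Illustrative only: the plugin's real splitter may behave differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


if __name__ == "__main__":
    # Ten tokens, windows of four, one token shared between neighbours.
    for chunk in split_into_chunks(list(range(10)), chunk_size=4, chunk_overlap=1):
        print(chunk)
```

A larger overlap repeats more context across neighbouring chunks at the cost of producing more chunks for the same document.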

Note: exclude takes precedence over include. Files excluded by exclude will not be included even if matched by include.
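
The precedence rule can be illustrated with a short Python sketch. It uses the standard library's `fnmatch` as a stand-in for the plugin's glob matching, which is an assumption: the plugin's actual glob dialect is not specified here, and `fnmatch` lets `*` match across `/`, which many glob implementations do not. The helper names `parse_patterns` and `is_selected` are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_patterns(value):
    """Accept either a list of patterns or a comma-separated string."""
    if isinstance(value, str):
        return [p.strip() for p in value.split(",") if p.strip()]
    return list(value)


def is_selected(path, include="*", exclude=""):
    """Return True if path should be imported.

    exclude is checked first, so it wins over include.
    """
    if any(fnmatchcase(path, p) for p in parse_patterns(exclude)):
        return False
    return any(fnmatchcase(path, p) for p in parse_patterns(include))


if __name__ == "__main__":
    for path in ["docs/guide.md", "docs/draft.md", "src/main.py"]:
        selected = is_selected(path,
                               include="docs/*.md,docs/*.txt",
                               exclude=["docs/draft*"])
        print(path, selected)
```

Here `docs/draft.md` matches an include pattern but is still skipped, because the exclude pattern is evaluated first.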

Using the Plugin in CNB

Basic Usage Example

    main:
      push:
        - stages:
            - name: build knowledge base
              image: cnbcool/knowledge-base
              settings:
                include:
                  - docs/*.md
                  - docs/*.txt

Example with Document Processing and Issue Sync Enabled

    main:
      push:
        - stages:
            - name: build knowledge base
              image: cnbcool/knowledge-base
              settings:
                include:
                  - docs/*.md
                  - docs/*.txt
                issue_sync_enabled: true
                issue_labels:
                  - bug
                  - feature
                issue_state: "open"
                issue_priority: "P0,P1"

Using CNB's Open API for Knowledge Base Retrieval

This API is used to query the knowledge base, returning relevant information based on provided query keywords.

Before starting, please read the CNB Open API Tutorial.

The access token requires the permission repo-code:r (read repository code).

API Information

  • URL: https://api.cnb.cool/{slug}/-/knowledge/base/query
  • Method: POST
  • Content Type: application/json

Note: Replace {slug} with the repository slug. For example, the repository for the official CNB documentation knowledge base is https://cnb.cool/cnb/docs, so its {slug} is cnb/docs.

Request Parameters

The request body should be in JSON format with the following fields:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Keywords or question to query |
| top_k | number | No | Maximum number of results to return. Default is 5 |
| score_threshold | number | No | Relevance score threshold. Default is 0 |

Request Example

    {
      "query": "Cloud-native development configuration for custom buttons"
    }

Response Content

The response is in JSON format, containing an array of results. Each result includes the following fields:

| Field | Type | Description |
| --- | --- | --- |
| score | number | Relevance score, range 0-1 |
| chunk | string | Matched knowledge base text |
| metadata | object | Content metadata |

Metadata Field Details

| Field | Type | Description |
| --- | --- | --- |
| hash | string | Unique content hash |
| name | string | Document name |
| path | string | Document path |
| position | number | Position in original document |
| score | number | Relevance score |
| type | string | Content type, e.g., "code", "issue" |
| url | string | Content URL |

Response Example

    [
      {
        "score": 0.8671732,
        "chunk": "This cloud-native remote development solution is based on Docker...",
        "metadata": {
          "hash": "15f7a1fc4420cbe9d81a946c9fc88814",
          "name": "quick-start",
          "path": "vscode/quick-start.md",
          "position": 0,
          "score": 0.8671732
        }
      }
    ]
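
A response of this shape can be consumed with the standard `json` module. The sketch below is one possible way to do it, not an official client: the helper name `select_chunks` and the client-side `min_score` filter are assumptions, and metadata fields are read defensively because the example response above does not include every documented field.

```python
import json

# Sample taken from the response example in this document.
SAMPLE_RESPONSE = """
[
  {
    "score": 0.8671732,
    "chunk": "This cloud-native remote development solution is based on Docker...",
    "metadata": {
      "hash": "15f7a1fc4420cbe9d81a946c9fc88814",
      "name": "quick-start",
      "path": "vscode/quick-start.md",
      "position": 0,
      "score": 0.8671732
    }
  }
]
"""


def select_chunks(response_text, min_score=0.0):
    """Return (path, chunk) pairs with score >= min_score, best match first."""
    results = json.loads(response_text)
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    # .get() is used because not every metadata field is guaranteed to be present.
    return [(r["metadata"].get("path", ""), r["chunk"]) for r in kept]


if __name__ == "__main__":
    for path, chunk in select_chunks(SAMPLE_RESPONSE, min_score=0.5):
        print(f"{path}: {chunk}")
```

The returned chunks can then be passed on as the knowledge context of the combined prompt.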

Usage Example

cURL Request Example

Note: {slug} is the repository slug where the knowledge base plugin runs. For example, the CNB official documentation knowledge base is cnb/docs

    curl -X "POST" "https://api.cnb.cool/{slug}/-/knowledge/base/query" \
      -H "accept: application/json" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${token}" \
      -d '{ "query": "Cloud-native development configuration for custom buttons" }'
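
For applications written in Python, the same request could be issued with the standard library alone. The following is a rough equivalent of the cURL call, under stated assumptions: the function names `build_query_request` and `query_knowledge_base` are hypothetical, the 30-second timeout is an arbitrary choice, and error handling is omitted. The URL, headers, and body fields come from the API description above.

```python
import json
import urllib.request

API_BASE = "https://api.cnb.cool"


def build_query_request(slug, token, query, top_k=None, score_threshold=None):
    """Build (but do not send) the POST request for a knowledge base query."""
    body = {"query": query}
    # Optional fields are only sent when set, so the server defaults apply otherwise.
    if top_k is not None:
        body["top_k"] = top_k
    if score_threshold is not None:
        body["score_threshold"] = score_threshold
    return urllib.request.Request(
        url=f"{API_BASE}/{slug}/-/knowledge/base/query",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def query_knowledge_base(slug, token, query, **options):
    """Send the query and return the decoded JSON list of results."""
    request = build_query_request(slug, token, query, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_knowledge_base("cnb/docs", token, "custom buttons", top_k=3)` requires a valid access token with the repo-code:r permission.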

Notes

  1. The chunk field in the response contains the matched knowledge base content fragment in Markdown format.
  2. Higher score values indicate better matches.
  3. The slug in the URL is the repository slug where the knowledge base plugin runs.