Knowledge Retrieval

Use the Knowledge Retrieval node to integrate existing knowledge bases into your workflows. The node searches the selected knowledge bases for information relevant to a query and outputs the results as context for downstream nodes (e.g., LLMs). Below is an example of using the Knowledge Retrieval node in a Chatflow:

The User Input node collects the user query.
The Knowledge Retrieval node searches the selected knowledge base(s) for content related to the user query and outputs the retrieval results.
The LLM node generates a response based on both the user query and retrieved knowledge.
The Answer node returns the LLM’s response to the user.

Before using a Knowledge Retrieval node, ensure that you have at least one available knowledge base. To learn about creating knowledge bases, see Knowledge.

On Dify Cloud, knowledge retrieval operations are subject to rate limits based on the subscription plan. For more information, see Knowledge Request Rate Limit.

Configure a Knowledge Retrieval Node

To make the Knowledge Retrieval node work properly, you need to specify:

What it should search for (the query)
Where it should search (the knowledge base)
How to process the retrieval results (the node-level retrieval settings)

You can also use document metadata to enable filter-based searches and further improve retrieval precision.

Specify the Query

Provide the query content that the node should search for in the selected knowledge base(s).

Query Text: Select a text variable. For example, use userinput.query to reference user input in Chatflows, or a custom text-type user input variable in Workflows.
Query Images: Select an image variable, e.g., the image(s) uploaded by the user through a User Input node, to search by image. The image size limit is 2 MB.
For self-hosted deployments, you can adjust the image size limit via the environment variable ATTACHMENT_IMAGE_FILE_SIZE_LIMIT.

The Query Images option is available only when at least one multimodal knowledge base is added. Such knowledge bases are marked with the Vision tag, indicating that they use a multimodal embedding model.

Select Knowledge to Search

Add one or more existing knowledge bases for the node to search for content relevant to the query. When multiple knowledge bases are added, knowledge is first retrieved from all of them simultaneously, then combined and processed according to the node-level retrieval settings.
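The retrieve-then-combine behavior described above can be pictured with a minimal Python sketch. It is an illustration only, not Dify's implementation: `retrieve_from_all` and the two `search_*` functions are hypothetical stand-ins for real knowledge bases.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_from_all(query, search_functions):
    """Query every knowledge base concurrently, then merge into one pool."""
    with ThreadPoolExecutor() as pool:
        # pool.map keeps results in the same order as search_functions.
        per_base = list(pool.map(lambda search: search(query), search_functions))
    # Flatten the per-knowledge-base lists into a single candidate list.
    return [chunk for chunks in per_base for chunk in chunks]


# Hypothetical stand-ins for two knowledge bases.
def search_policies(query):
    return [{"title": "Refund policy", "content": f"Policy text about {query}"}]


def search_faq(query):
    return [{"title": "FAQ", "content": f"Common questions about {query}"}]


pool_of_results = retrieve_from_all("refunds", [search_policies, search_faq])
```

The merged pool is what the node-level retrieval settings would then rerank and trim.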

Knowledge bases marked with the Vision tag support cross-modal retrieval—retrieving both text and images based on semantic relevance.

You can click the Edit icon next to any added knowledge base to modify its settings.

Configure Node-Level Retrieval Settings

Further fine-tune how the node processes retrieval results after they are fetched from the knowledge base(s).

Retrieval settings exist at two layers: the knowledge base level and the Knowledge Retrieval node level. Think of them as two consecutive filters. The knowledge base settings determine the initial pool of results, and the node settings then rerank the results or narrow down the pool.

Rerank Settings
- Weighted Score: The relative weight between semantic similarity and keyword matching during reranking. Higher semantic weight favors meaning relevance, while higher keyword weight favors exact matches.
  Weighted Score is available only when all added knowledge bases are high-quality ones.
- Rerank Model: The rerank model used to re-score and reorder all results based on their relevance to the query.
  If any multimodal knowledge bases are added, select a multimodal rerank model (indicated by the Vision tag) as well. Otherwise, retrieved images will be excluded from reranking and the final output.
Top K: The maximum number of top results to return after reranking. When a rerank model is selected, this value will be automatically adjusted based on the model’s maximum input capacity (how much text the model can process at once).
Score Threshold: The minimum similarity score for returned results. Results scoring below this threshold are excluded. Use higher thresholds for stricter relevance or lower thresholds to include broader matches.
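To make the interplay of Weighted Score, Score Threshold, and Top K more concrete, here is a minimal Python sketch. It assumes each candidate already carries a semantic score and a keyword score normalized to the range 0 to 1. The `Candidate` class, the `rerank_weighted` function, and the linear blending formula are illustrative assumptions, not Dify's actual scoring code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One retrieved chunk with hypothetical, pre-normalized scores in [0, 1]."""
    content: str
    semantic_score: float
    keyword_score: float


def rerank_weighted(candidates, semantic_weight, top_k, score_threshold=None):
    """Blend both scores, drop results below the threshold, keep at most top_k."""
    keyword_weight = 1.0 - semantic_weight
    scored = [
        (semantic_weight * c.semantic_score + keyword_weight * c.keyword_score, c)
        for c in candidates
    ]
    if score_threshold is not None:
        # Results scoring below the threshold are excluded.
        scored = [(score, c) for score, c in scored if score >= score_threshold]
    # Highest blended score first, then cut to the Top K limit.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


candidates = [
    Candidate("Refund policy overview", semantic_score=0.9, keyword_score=0.2),
    Candidate("Refund form RF-12", semantic_score=0.4, keyword_score=1.0),
    Candidate("Office opening hours", semantic_score=0.1, keyword_score=0.0),
]

semantic_first = rerank_weighted(candidates, semantic_weight=0.7, top_k=2, score_threshold=0.3)
keyword_first = rerank_weighted(candidates, semantic_weight=0.3, top_k=2, score_threshold=0.3)
```

With a higher semantic weight, the overview chunk ranks first. With a higher keyword weight, the exact-match form chunk ranks first. In both cases the unrelated chunk falls below the threshold and is dropped.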

Enable Metadata Filtering

Use existing document metadata to restrict retrieval to specific documents within your knowledge base, improving retrieval precision. With metadata filtering enabled, the Knowledge Retrieval node only searches documents that match the specified metadata conditions, rather than searching across the entire knowledge base. This is especially useful for targeted searching in large and diverse knowledge bases.
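As a rough illustration of the idea, the sketch below narrows a set of documents by metadata before any search takes place. The function names, the `(key, operator, value)` condition format, and the `is` and `contains` operators are hypothetical choices for this example, not the node's actual configuration schema.

```python
def matches(metadata, conditions):
    """Return True only if the metadata satisfies every (key, operator, value)."""
    for key, operator, expected in conditions:
        actual = metadata.get(key)
        if operator == "is":
            ok = actual == expected
        elif operator == "contains":
            ok = isinstance(actual, str) and expected in actual
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


def filter_documents(documents, conditions):
    """Keep only the documents whose metadata matches all conditions."""
    return [doc for doc in documents if matches(doc["metadata"], conditions)]


documents = [
    {"title": "Pricing 2024", "metadata": {"department": "sales", "language": "en"}},
    {"title": "Tarifs 2024", "metadata": {"department": "sales", "language": "fr"}},
    {"title": "Onboarding", "metadata": {"department": "hr", "language": "en"}},
]
searchable = filter_documents(
    documents, [("department", "is", "sales"), ("language", "is", "en")]
)
```

Only the documents left in `searchable` would be considered during retrieval, which is why filtering pays off most in large, mixed knowledge bases.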

Output

The Knowledge Retrieval node outputs the retrieval results as a variable named result, which is an array of retrieved document chunks containing their content, metadata, title, and other attributes. When the retrieval results contain image attachments, the result variable also includes a field named files containing image details.
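If you post-process the result variable yourself, for example in a downstream code step, the logic might look like the following Python sketch. It assumes each chunk is a dictionary exposing `title` and `content` keys. The exact shape of each chunk is an assumption here, so inspect the actual output of your node before relying on specific field names. `build_context` is a hypothetical helper.

```python
def build_context(result):
    """Join retrieved chunks into one numbered text block, skipping empty ones."""
    sections = []
    for index, chunk in enumerate(result, start=1):
        title = chunk.get("title") or "Untitled"
        content = (chunk.get("content") or "").strip()
        if content:
            sections.append(f"[{index}] {title}\n{content}")
    return "\n\n".join(sections)


# Hypothetical sample shaped like the assumed result array.
sample_result = [
    {"title": "Refund policy", "content": "Refunds are issued within 14 days.", "metadata": {}},
    {"title": "Shipping", "content": "  ", "metadata": {}},
]
context = build_context(sample_result)
```

Numbering each section keeps the source of a statement traceable when the text is passed to another node.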

Use with LLM Nodes

To use the retrieval results as context in an LLM node:

In Advanced Settings > Context, select the Knowledge Retrieval node’s result variable.
In the system instruction, reference the Context variable.
Optional: If the LLM is vision-capable, enable Vision so it can process image attachments in the retrieval results.
You don’t need to specify the retrieval results as the vision input. Once Vision is enabled, the LLM will automatically access any retrieved images.

In Chatflows, citations are shown by default alongside responses that reference knowledge. You can turn this off by disabling Citation and Attributions in Features at the top right corner of the canvas.
