Harnessing Dify + Local LLMs on AMD Ryzen™ AI PCs for Private Workflows

The video demonstrates how to integrate Lemonade Server with Dify AI on an AMD Ryzen AI PC to run local large language models and create context-aware chatbots using Retrieval-Augmented Generation. It guides viewers through installing Dify via Docker, configuring local models, importing knowledge bases, and building customized agentic workflows, all within a user-friendly interface in under five minutes.

In this video, the presenter demonstrates how to seamlessly integrate Lemonade Server with Dify AI to create agentic workflows using Dify's user-friendly interface, while running large language models (LLMs) locally on an AMD Ryzen AI PC. This integration enables the addition of Retrieval-Augmented Generation (RAG) for context-aware LLMs, enhancing a chatbot's ability to provide relevant and accurate responses grounded in a specific knowledge base. Viewers are encouraged to watch a previous video for detailed instructions on installing and using Lemonade Server.

The installation process for Dify begins by cloning the GitHub repository via the terminal, navigating to the Docker folder, and copying the environment example file to a new configuration file. Using Docker Compose, the required images are pulled and the Dify services are launched as containers. Once the containers are running, users access Dify through the provided URL, create an account if it's their first time, and log in. Within the Dify interface, the Lemonade plugin is installed from the Marketplace to enable integration between the two platforms.
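The steps above correspond roughly to the following terminal commands. This is a sketch based on Dify's public README rather than the exact commands shown in the video; the repository URL and Compose invocation may differ between releases.

```shell
# Clone the Dify repository (official langgenius/dify repo assumed)
git clone https://github.com/langgenius/dify.git

# Move into the Docker deployment folder
cd dify/docker

# Copy the example environment file to the active configuration
cp .env.example .env

# Pull the images and start all Dify services in the background
docker compose up -d
```

Once the containers report healthy, Dify is typically reachable at `http://localhost/` in a browser, where the first visit prompts for account creation.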

After installing Lemonade, the next step is configuring the language models. This is done by accessing the Settings menu, selecting Model Provider, and adding models such as Llama 3.2 with 3 billion parameters. The model runs locally on the Ryzen AI PC, utilizing both the Neural Processing Unit (NPU) and integrated GPU for efficient performance. Users input the model’s endpoint URL, set the recommended context size, and enable agent thought support. It is important to ensure that the models are downloaded beforehand via Lemonade’s Model Manager.
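Since Lemonade Server exposes an OpenAI-compatible API, the endpoint configured in Dify can also be exercised directly. The base URL, port, and model identifier below are assumptions for a default local install (not values confirmed in the video); this minimal Python sketch builds a chat-completions request against it.

```python
import json
from urllib import request

# Assumed default: Lemonade Server's OpenAI-compatible base URL.
# The port and path may differ on your installation.
LEMONADE_BASE_URL = "http://localhost:8000/api/v1"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completions request for the local server."""
    payload = {
        "model": model,  # e.g. the Llama 3.2 3B model registered in Dify
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return request.Request(
        f"{LEMONADE_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical model name; use whatever Lemonade's Model Manager lists.
req = build_chat_request("Llama-3.2-3B-Instruct-Hybrid", "Hello!")

# Sending the request (uncomment once Lemonade Server is running):
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same base URL is what gets entered as the model's endpoint in Dify's Model Provider settings.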

With Dify configured to work with Lemonade, the video then guides viewers through building a context-aware chatbot. This involves importing relevant documents, such as FAQs and README files, into the Knowledge tab to create a knowledge base; the default chunking settings are used initially to process the documents. In the Studio section, a new chatflow is created from scratch, and a Knowledge Retrieval node is added to give the LLM access to the imported context. The LLM node is then configured with the selected model, the knowledge base, and a system prompt instructing the chatbot to be helpful, knowledgeable, and to avoid fabricating information.
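Once the chatflow is published, it can also be queried programmatically through Dify's app API rather than the web UI. The sketch below follows Dify's documented `chat-messages` endpoint; the base URL and API key are placeholders for values found on the app's API access page.

```python
import json
from urllib import request

# Placeholder values: the API base of your Dify deployment and an app
# API key from the chatflow's API access page (both hypothetical here).
DIFY_API_BASE = "http://localhost/v1"
DIFY_API_KEY = "app-xxxxxxxx"

def build_query(question: str, user_id: str = "demo-user") -> request.Request:
    """Build a request to Dify's chat-messages endpoint for the chatflow."""
    payload = {
        "inputs": {},
        "query": question,            # answered against the knowledge base
        "response_mode": "blocking",  # wait for the complete answer
        "user": user_id,              # identifies the end user to Dify
    }
    return request.Request(
        f"{DIFY_API_BASE}/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {DIFY_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_query("What does the README say about installation?")

# Sending the request (uncomment against a running Dify instance):
# with request.urlopen(req) as resp:
#     print(json.load(resp)["answer"])
```

Using `"response_mode": "blocking"` keeps the example simple; Dify also supports streaming responses for interactive clients.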

Finally, the presenter tests the chatbot by asking a question that requires the imported knowledge to answer correctly, demonstrating the effectiveness of the setup. The entire process, from installation to creating a customized chatbot, is completed in under five minutes. Viewers are encouraged to try integrating Lemonade with their own applications or creating new workflows in Dify, and to share their experiences via the provided contact email. The video concludes with a call to like and subscribe for future content related to AMD Ryzen AI PCs and AI workflows.