Akshay Ballal

How to Deploy LangChain App 🦜🔗 as an API

Introduction

Recently, I had an idea to add my own AI assistant to my website, www.akshaymakes.com. I wanted it to answer questions about my projects and me, and to talk about machine learning and artificial intelligence in general. My website is built on SvelteKit, and I could have called OpenAI's API directly from the front end. But I wanted the assistant to be extensible: in the future, it should be able to browse through my blogs, projects, and other material on my website and answer questions much better. So, for this purpose and to keep the website simple, I created a LangChain application using FastAPI that integrates with my website through a REST API. I wanted the experience to be similar to ChatGPT, i.e. it should remember the context of the conversation and continue it naturally. I deployed the application on Deta Space, which was quite simple to do. Let me take you through the process step by step. We will keep the application simple for now. In future blogs, I will explain how you can add your website as context using vector databases like Weaviate or Pinecone to make the chat assistant more knowledgeable about you.

So in this tutorial, I will show you how to create an API that returns OpenAI outputs using LangChain, FastAPI, and Deta Space. Let's begin.

Setting Up

  1. Start with a new Python project in a new directory. In our example, let us call the project directory LangChainAPI.

  2. Create a directory called app inside LangChainAPI, and a new file called .env next to it. The .env file holds your OpenAI API key as OPENAI_API_KEY=<your key>, which the app reads when it runs locally.

  3. Inside the app folder, create an empty __init__.py file and two new files, main.py and conversation.py.

  4. In the project directory, create a virtual environment with the following terminal command.

    python -m venv venv
    
  5. This is what the project structure will look like:

    ├── app
    │   ├── __init__.py
    │   ├── main.py
    │   └── conversation.py
    ├── venv
    ├── .gitignore
    └── .env
    
    
  6. Activate the environment.

    For Windows

    venv\Scripts\activate.bat
    

    For MacOS/Linux

    source venv/bin/activate
    
  7. Install the dependencies.

    pip install langchain fastapi "uvicorn[standard]" openai python-dotenv
    
  8. Install Deta Space CLI

    For Windows

    iwr https://deta.space/assets/space-cli.ps1 -useb | iex
    

    For MacOS/Linux

    curl -fsSL get.deta.dev/space-cli.sh | sh
    
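  9. Initialize a Git repository and commit. Make sure .gitignore lists venv and .env first, so that the virtual environment and your API key are not committed.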

    git init
    git add .
    git commit -m "First Commit"
    
  10. Create an account on https://deta.space/signup and get your access token from the settings.


  11. Log in to Deta Space from the CLI. It will ask for the access token; paste it in.

    space login
    

That is all for the setup. Now let us create the API.
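
Before writing any code, it may help to confirm that the layout from the steps above is in place. The following is only a small sketch using the standard library; the function name `check_setup` is my own, and it assumes the key is stored in .env as a plain `OPENAI_API_KEY=...` line.

```python
from pathlib import Path

# Files the setup steps above are expected to have created.
EXPECTED_FILES = [
    "app/__init__.py",
    "app/main.py",
    "app/conversation.py",
    ".env",
]


def check_setup(root: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    base = Path(root)
    problems = [
        f"missing {name}" for name in EXPECTED_FILES if not (base / name).is_file()
    ]

    env_file = base / ".env"
    if env_file.is_file():
        # Assumption: the key is stored as a plain KEY=value line.
        keys = {
            line.split("=", 1)[0].strip()
            for line in env_file.read_text().splitlines()
            if "=" in line and not line.lstrip().startswith("#")
        }
        if "OPENAI_API_KEY" not in keys:
            problems.append(".env has no OPENAI_API_KEY entry")
    return problems


if __name__ == "__main__":
    # Run from the project directory; prints nothing when everything is in place.
    for problem in check_setup("."):
        print(problem)
```

Run it from the LangChainAPI directory. If it prints nothing, the files and the key entry are where the rest of the tutorial expects them.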

Creating the API

In the app folder open conversation.py. This is where we will write the logic for LangChain.

    from dotenv import load_dotenv
    # These top-level imports match the LangChain release used in this post.
    from langchain import OpenAI, LLMChain, PromptTemplate
    from langchain.memory import ConversationBufferWindowMemory

    # Read OPENAI_API_KEY from the .env file into the environment.
    load_dotenv()

    # Created once at import time so the history survives between requests.
    # It keeps the last two exchanges and is shared by every caller.
    memory = ConversationBufferWindowMemory(k=2)

    def conversation(human_input):
        template = """Assistant is a large language model trained by OpenAI.

        Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

        Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

        Overall, Assistant is a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.

        {history}
        Human: {human_input}
        Assistant:"""

        prompt = PromptTemplate(
            input_variables=["history", "human_input"],
            template=template
        )

        chatgpt_chain = LLMChain(
            llm=OpenAI(temperature=0),
            prompt=prompt,
            verbose=True,
            memory=memory,  # fills {history} with the recent exchanges
        )

        # Only human_input is passed in; the memory supplies the history.
        output = chatgpt_chain.predict(human_input=human_input)
        return output
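
The `{history}` placeholder in the template is where the earlier turns of the conversation are meant to go. To make that idea concrete, here is a rough illustration in plain Python. It is not LangChain's implementation: `WindowedHistory` and `build_prompt` are hypothetical names, and the shortened template and the "Human:/Assistant:" formatting are my own simplifications.

```python
from collections import deque

# Shortened stand-in for the tail of the prompt template.
TEMPLATE = """{history}
Human: {human_input}
Assistant:"""


class WindowedHistory:
    """Keeps only the last `k` exchanges, like a sliding-window memory."""

    def __init__(self, k: int = 2) -> None:
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def save(self, human_input: str, reply: str) -> None:
        self.turns.append(f"Human: {human_input}\nAssistant: {reply}")

    def render(self) -> str:
        return "\n".join(self.turns)


def build_prompt(history: WindowedHistory, human_input: str) -> str:
    """Fill the template with the remembered turns and the new message."""
    return TEMPLATE.format(history=history.render(), human_input=human_input)


history = WindowedHistory(k=2)
history.save("Hi, I'm Sam.", "Hello Sam!")
history.save("What is FastAPI?", "A Python web framework.")
history.save("Who maintains it?", "It is an open-source project.")

# With k=2 the first exchange has already been dropped from the prompt.
print(build_prompt(history, "What is my name?"))
```

The window keeps the prompt short, but it also means the model can no longer see the first exchange, so it is worth choosing the window size with your typical conversation length in mind.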



In main.py, define the FastAPI app and the /conversation endpoint.

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel

from app.conversation import conversation


class Input(BaseModel):
    human_input: str


class Output(BaseModel):
    output: str


app = FastAPI()


@app.post("/conversation", response_model=Output)
async def chat(payload: Input):
    # Pass the user's message to the LangChain logic and wrap the reply.
    return Output(output=conversation(payload.human_input))


# Origins (scheme + host + port) that browsers may call this API from.
origins = [
    "http://localhost",
    "http://localhost:5173",
    "...Your Domains...",
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

In the origins list, you can add any other domains that will be making requests to your API.
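
If you would rather not hard-code the domains, one option is to read them from an environment variable. This is just a sketch of that idea: the variable name `ALLOWED_ORIGINS` and the helper `load_origins` are my own assumptions, not something FastAPI or Deta Space defines.

```python
import os

# Local development origins used when nothing is configured.
DEFAULT_ORIGINS = ["http://localhost", "http://localhost:5173"]


def load_origins(env_var: str = "ALLOWED_ORIGINS") -> list[str]:
    """Read a comma-separated list of origins from the environment.

    Falls back to the local development origins when the variable is unset.
    """
    raw = os.getenv(env_var, "")
    # Drop whitespace and trailing slashes: browsers send the Origin header
    # without a trailing slash, and the comparison is an exact string match.
    origins = [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]
    return origins or list(DEFAULT_ORIGINS)


origins = load_origins()
```

The resulting list can be passed to `allow_origins` exactly like the hard-coded one.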

Run the API Server Locally

In the terminal, use this command to start the server locally. It will start on localhost:8000.

uvicorn app.main:app --reload

To test your API, go to localhost:8000/docs in your browser. This should open Swagger Docs.


Enter a prompt and check that you get a response. Once this works, feel free to play around with LangChain: you can experiment with memory, context, and so on. At the moment, Deta Space does not support local vector databases, so you will have to use a remote one if you need to store your context files and embeddings.
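
Besides the Swagger page, you can also call the endpoint from a script. Here is a minimal sketch using only the standard library. It assumes the server is running on the default uvicorn address and that the request and response bodies match the Input and Output models from main.py; the helper names `build_request` and `ask` are my own.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/conversation"  # default local uvicorn address


def build_request(human_input: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request that the /conversation endpoint expects."""
    body = json.dumps({"human_input": human_input}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(human_input: str, url: str = API_URL, timeout: float = 60.0) -> str:
    """Send one message and return the assistant's reply."""
    request = build_request(human_input, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["output"]
```

With the server running, `print(ask("What is LangChain?"))` should print the model's reply. The generous timeout is there because longer completions can take a while to come back.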

Deploy to Deta Space

Once you are happy with the API, commit the changes to Git:

git add .
git commit -m "API Works"

Initiate Deta Space:

space new

This will create a new Spacefile in the project. Open this file and replace its contents with the following:

# Spacefile Docs: https://go.deta.dev/docs/spacefile/v0
v: 0
micros:
  - name: LangChainAPI
    src: ./
    engine: python3.9
    primary: true
    run: uvicorn app.main:app
    presets: 
      env: 
        - name: OPENAI_API_KEY
          description: OpenAI API key used by the LangChain app
          default: "OpenAI API Key"
      api_keys: true

Save the file and run this command in the terminal.

space push

This will create an instance of your API on the Deta Space dashboard. In my dashboard it is named “gpt_server”; in your case, it will be “LangChainAPI”.

Go to the instance settings and add your OpenAI API key from the “Configurations” tab. Then, go to the domains tab and get the base URL of your API. You can test it out in the browser first using the Swagger docs and then use it in your application as a REST API.
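
Calling the deployed API from another Python service could look roughly like the sketch below. Two things here are assumptions on my part: the base URL is a placeholder you replace with the one from the domains tab, and the header name `X-Space-App-Key` is what I understand Deta Space expects when `api_keys` is enabled, so please check it against the Deta docs. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: Deta Space expects the API key in this header; verify in its docs.
API_KEY_HEADER = "X-Space-App-Key"


def deployed_request(
    base_url: str, api_key: str, human_input: str
) -> urllib.request.Request:
    """Build an authenticated POST to the deployed /conversation endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/conversation",
        data=json.dumps({"human_input": human_input}).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


def ask_deployed(
    base_url: str, api_key: str, human_input: str, timeout: float = 60.0
) -> str:
    """Send one message to the deployed API and return the reply text."""
    request = deployed_request(base_url, api_key, human_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["output"]
```

Keep the API key on the server side (for example in an environment variable) instead of shipping it to the browser.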




Top comments (4)

japes12345

the deta cli doesn't install - I've never heard of them - does this work with vercel?

Akshay Ballal

Hey there. I hope @wh Chan has answered your problem with the Deta CLI. Regarding Vercel: it has a much shorter limit before the serverless function times out. It's 20 seconds on Vercel, so longer chat responses will time out unless you limit your response tokens. I haven't faced this issue with Deta yet.

WH.C

if you are using MacOS, the cmd is wrong, it should be

curl -fsSL get.deta.dev/space-cli.sh | sh

deta.space/docs/en/build/fundament...

Akshay Ballal

Thanks a lot for correcting the mistake. I will update it.