Orquesta provides your product teams with no-code collaboration tooling to experiment, operate, and monitor LLMs and remote configurations within your SaaS. As an LLMOps engineer, using Orquesta, you can easily perform prompt engineering, prompt management, LLMOps, experimentation in production, push new versions directly to production, and have full observability and monitoring.
LangChain is a framework for developing applications powered by large language models. It enables applications that are data-aware to connect a language model to other sources of data, and it allows a language model to interact with its environment.
In this article, you will learn how to integrate Orquesta and LangChain. We will explain how to set a prompt in Orquesta, and request it from LangChain to predict an output. All this is possible with the help of the Orquesta Python SDK and can be implemented in a few easy steps.
Prerequisites
For you to be able to follow along in this tutorial, you will need the following:
Jupyter Notebook (or any IDE of your choice).
Orquesta Python SDK.
Step 1 - Install SDK and create a client instance
You can easily install the Python SDK and Cohere via the Python package installer pip
.
pip install orquesta-sdk
pip install langchain
This will install the Orquesta SDK and LangChain on your local machine, but you need to understand that this command will only install the bare minimum requirements of LangChain. A lot of the value of LangChain comes when integrating it with various model providers, data stores, etc.
Grab your API Key from Orquesta (https://my.orquesta.dev/<workspace-name>/settings/developers )
which will be used to create a client instance.
import os
import time
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.prompts import OrquestaPromptMetrics, OrquestaPromptMetricsEconomics
from orquesta_sdk.helpers import orquesta_openai_parameters_mapper
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import get_openai_callback
Explanation:
Import the time module to calculate the total time for the program to run.
The
OrquestaClient
and theOrquestaClientOptions
classes which are already defined in theorquesta_sdk
module are imported.To be able to log all the interactions with the LLM, we use the
OrquestaPromptMetrics
class.Orquesta has many helper functions that map and interface between Orquesta and specific LLM provider, for this integration, we will make use of the
orquesta_openai_parameters_mapper
helper.The AIMessage class is a message from an AI, HumanMessage is a message from a human, and the SystemMessage is a message for priming AI behaviour, usually passed in as the first of a sequence of input messages.
The
ChatOpenAI
class is an OpenAI Chat large language models API. To be able to use it, you should have the OpenAI Python package installed and the environment variableOPENAI_API_KEY
set with your API key.
# Initialize Orquesta client
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)
An instance of the
OrquestaClient
class is created and initialized with the previously configured options object. This client instance can now interact with the Orquesta service using the provided API key for authentication.In the next line of code, we create the instance of the OrquestaClientOptions and configure it with the
api_key
and thettl
(Time to Live) in seconds for the local cache; by default, it is 3600 seconds (1 hour).
Step 2 - Set up a chat prompt
Set up your chat prompt in the Orquesta dashboard. Make sure it is a chat prompt and not a completion prompt. Set your prompt key and domain (if you have any), and Publish.
Once that is set up, create your first chat prompt, give it a name prompt, and add all the necessary information. Click on Save.
As you can see from the screenshot, the prompt message is “What is a good name for a company that makes good beard oil”, and the model is openai/gpt-3.5-turbo. Click Save.
Step 3 - Request a variant from Orquesta
To request a specific variant from your newly created prompt, the Code Snippet Generator can easily generate the code for a prompt variant by right-clicking on the prompt or opening the Code Snippet component.
Copy the code snippet and paste it into your editor.
prompt = client.prompts.query(
key="customer-support-chat",
context={"environments": ["test"]},
variables={"customer_name": ""},
metadata={"chain-id": "js2938js2ja"},
)
Step 4 - Transform the message into LangChain format
The prompt from Orquesta is transformed into a format to pass into LangChain.
# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')
messages = []
for message in prompt.value.get("messages", []):
role = message.get("role")
content = message.get("content")
if role == "system":
messages.append(SystemMessage(content=content))
elif role == "user":
messages.append(HumanMessage(content=content))
elif role == "assistant":
messages.append(AIMessage(content=content))
parameters = orquesta_openai_parameters_mapper(prompt.value)
chat = ChatOpenAI(
temperature=parameters.get("temperature"),
max_tokens=parameters.get("max_tokens"),
openai_api_key="api_key",
)
with get_openai_callback() as cb:
result = chat(messages)
# End time of the completion request
end_time = time.time()
print(f"End time: {end_time}")
print(result.content)
# Calculate the difference (latency) in milliseconds
latency = (end_time - start_time) * 1000
print(f'Latency is: {latency}')
economics = OrquestaPromptMetricsEconomics(
total_tokens=cb.total_tokens,
completion_tokens=cb.completion_tokens,
prompt_tokens=cb.prompt_tokens,
)
# Report the metrics back to Orquesta
metrics = OrquestaPromptMetrics(
economics=economics,
llm_response=result.content,
latency=latency
)
prompt.add_metrics(metrics=metrics)
Explanation
Initialize an empty list named
messages
, which will store message objects.A
for
loop iterates through the list of messages obtained fromprompt.value
. If no messages are found, an empty list is used as a default value.Within the loop, the code extracts the
role
andcontent
attributes from each message.Depending on the
role
of the message ("system", "user", or "assistant"), a message object is created and appended to the messages list.Pass in the value of the prompt into the Orquesta OpenAI helper and store them in the parameters variable.
A
ChatOpenAI
object is created with specified parameters, including the temperature and maximum tokens, which affect the behaviour of the language model. Theopenai_api_key
is provided as an argument.The
chat
object is invoked with themessages
list as an argument. This processes the messages using the language model and generates a response.
Finally, the content of the response generated by the language model is printed to the console.
The response from the LLM is “Beard Bliss”.
Wrap up
In conclusion, the integration of Orquesta SDK with LangChain brings forth a powerful synergy that amplifies the capabilities of both platforms, and you have been able to set up a prompt in Orquesta, create a client, connect with LangChain, and get a response from the LangChain OpenAI API.
Links
Check out Orquesta documentation.
Full code
Here is the full code for this tutorial.
import os
import time
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.prompts import OrquestaPromptMetrics, OrquestaPromptMetricsEconomics
from orquesta_sdk.helpers import orquesta_openai_parameters_mapper
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import get_openai_callback
# Initialize Orquesta client
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)
prompt = client.prompts.query(
key="customer-support-chat",
context={"environments": ["test"]},
variables={"customer_name": ""},
metadata={"chain-id": "js2938js2ja"},
)
# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')
messages = []
for message in prompt.value.get("messages", []):
role = message.get("role")
content = message.get("content")
if role == "system":
messages.append(SystemMessage(content=content))
elif role == "user":
messages.append(HumanMessage(content=content))
elif role == "assistant":
messages.append(AIMessage(content=content))
parameters = orquesta_openai_parameters_mapper(prompt.value)
chat = ChatOpenAI(
temperature=parameters.get("temperature"),
max_tokens=parameters.get("max_tokens"),
openai_api_key="api_key",
)
with get_openai_callback() as cb:
result = chat(messages)
# End time of the completion request
end_time = time.time()
print(f"End time: {end_time}")
print(result.content)
# Calculate the difference (latency) in milliseconds
latency = (end_time - start_time) * 1000
print(f'Latency is: {latency}')
economics = OrquestaPromptMetricsEconomics(
total_tokens=cb.total_tokens,
completion_tokens=cb.completion_tokens,
prompt_tokens=cb.prompt_tokens,
)
# Report the metrics back to Orquesta
metrics = OrquestaPromptMetrics(
economics=economics,
llm_response=result.content,
latency=latency
)
prompt.add_metrics(metrics=metrics)
Top comments (0)