Blake Anderson

Using a Locally-Installed LLM to Fill in Client Requirement Gaps

Challenges in Gathering Requirements

Across 200+ website and mobile app projects, I have yet to see a client show up with a full set of requirements. In fact, many of our clients are academics who have limited knowledge of software development. More importantly, they often have domain-specific knowledge that far exceeds our own. This can make it challenging to pin down a solid version 1.0 release.

We rely on an interview process using Scenario-Based Design (SBD) to capture domain-specific details and clarify a Minimum Viable Product. We have developed a worksheet that steps clients through the process, covering project goals, key concepts, stakeholders, and well-defined scenarios. However, the interview process can be cumbersome, often taking hours to complete, and it is sensitive to the level of involvement from the client. To streamline it, we have integrated an LLM into the workflow to generate a baseline the client can react to, rather than asking them to work from a blank canvas.
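For illustration, you can think of the worksheet as a simple data structure. The sketch below is hypothetical (the interface and field names are invented for this article, not our production schema), but it captures the shape of what the interview produces.

interface SBDWorksheet {
  goals: string[]          // what the project should achieve, in the client's own words
  keyConcepts: string[]    // domain terms the development team needs to understand
  stakeholders: string[]   // people and roles affected by the software
  scenarios: string[]      // narrative scenarios describing how the software will be used
}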

More recently, we have released an interactive SBD tool to allow clients to work through the process on their own. Within our project system, clients can work through the steps and develop requirements. Since this involves a number of calls to an LLM API, we wanted to reduce the load and costs on our own systems. Instead, we have the client install an LLM locally and we use this instance to generate our scenarios. This makes for an interesting use case, which I thought I would share.

Installing the LLM

To make the process feasible for clients, we rely on Ollama to download and run a local LLM. The basic model is a ~5GB download, and it is sufficiently powerful for what we are trying to accomplish. Within the SBD tool, we provide instructions to get the LLM up and running.

Install instructions presented to user to obtain LLM

Once the LLM is installed, we detect the model and present a status indicator that lets the client know they are ready to generate a scenario-based design.

import axios from 'axios'
import { ref } from 'vue'

const models = ref<llmModel[]>([])
const ollamaRunning = ref(false)

// Query the local Ollama REST API for the list of installed models.
function getModels() {
  return axios.request<tagResponse>({
    url: "http://localhost:11434/api/tags",
  }).then((rsp) => {
    const tags = rsp.data
    models.value = tags.models
    ollamaRunning.value = true
  }).catch((err) => {
    // Ollama is not installed or not running on the default port
    ollamaRunning.value = false
    models.value = []
    console.error(err)
  })
}
In the template, one of two buttons is shown depending on whether any models were detected:
q-btn(label="LLM Online" color="positive" v-if="models.length > 0" icon="sym_o_assistant_on_hub")
q-btn(label="LLM Offline" color="secondary" v-else @click="dlgInstallInstructions=true")

Status indicator showing online status of locally-installed LLM
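In practice, something needs to call getModels so the indicator stays current. Here is a simplified sketch; the onMounted hook and the 10-second polling interval are illustrative, not our exact implementation.

import { onMounted } from 'vue'

onMounted(() => {
  getModels()                            // check for a running Ollama instance on load
  setInterval(() => getModels(), 10000)  // re-check so the badge updates once Ollama starts
})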

Interacting with the API

Making calls to the local instance is fairly straightforward. We use the ollama NPM package to manage calls to the LLM, which gives us the ability to perform chat completions by prompting the model. Here is the basic usage of the package.

import ollama from 'ollama/browser'
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)

We can even enable streaming so clients can watch the LLM's output appear in real time.

const output = ref('')
const message = { role: 'user', content: 'Why is the sky blue?' }
const stream = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of stream) {
  output.value += part.message.content // append each chunk as it arrives
}

As a real-world example, here is how we generate a project abstract from a basic description.

async function generateAbstract() {
  loading.value = true
  project.value.abstract = ''
  const messages = [
    { role: 'user', content: 'Given the following project description, write a project abstract in 5-8 sentences which describes the problem the project addresses, nature and aims of the software solution, and the expected outcomes. Provide only the abstract without additional details or commentary.' },
    { role: 'user', content: aims.value }
  ]
  const stream = await ollama.chat({ model: props.model, messages: messages, stream: true })
  for await (const part of stream) {
    project.value.abstract += part.message.content
  }
  loading.value = false
}

Prompting the LLM for an abstract

Streaming the response for an abstract

Using LLM for Scenario-Based Design

Since our Scenario-Based Design process involves many steps, we need to propagate project information forward as the client develops specifications. Rather than relying on conversational context alone, we explicitly inject the relevant project details into each request so we have more control over the process.

messages.push({ role: 'user', content: 'Given the following project DESCRIPTION and OUTLINE, List any important tasks that will help to shape the priority and order of deliverables.' })
messages.push({
      role: 'user', content: `
DESCRIPTION:
${project.value.abstract}
`})
messages.push({
      role: 'user', content: `
OUTLINE:
${timeline.value}
`})
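The assembled messages then go to the same ollama.chat call shown earlier, and the reply is streamed back to the client. A simplified sketch (the tasks ref here is illustrative):

const tasks = ref('')
const stream = await ollama.chat({ model: props.model, messages: messages, stream: true })
for await (const part of stream) {
  tasks.value += part.message.content
}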

We often want to get structured data out of the LLM, which can be challenging with the base Ollama model (llama3.1). Rather than have the client install a syntax-aware model like codellama, we use a prompting technique to elicit correctly formatted responses.

Since the LLM does not always cooperate when asked for JSON-formatted data, we start by providing a template of exactly what we want back.

{
      role: 'system', content: `
Provide only a JSON response starting at <BEGIN_JSON> to <END_JSON> with correct syntax strictly adhering to the following structure.

{
  "name":"Jane Doe",
  "gender":"Female",
  "age":"25",
  "race":"",
  "occupation":"student",
  "education":"bachelor's degree",
  "marital_status":"",
  "income_level":"low",
  "background":"[5-8 sentences about the character's background that led them to use the app with a specific purpose or need]",
  "success_story:"Jane was able to use the app successfully to solve her problem.",
  "media":{
    "image_url":"[a sample image url appropriate to the character]"
  }
}
`
},

In our experience, the response can be unpredictable and does not always format the JSON correctly. While codellama provides more consistent results with code syntax, we did not want our users to have to install another model. Instead, we give the LLM a little help by prefilling the beginning of the assistant's response.

messages.push({
      role: 'assistant', content: `Sure! here is the character in the JSON format you requested.

<BEGIN_JSON>
{
  "name":"`})

Using the assistant role to direct the format of the response, we get back the continuation of the chat conversation. We then prepend the prefix we supplied and extract the JSON between the sentinel tokens <BEGIN_JSON> and <END_JSON>. This gives us a much more reliable format to target when pulling out the structured data.

// Re-attach the prefilled prefix, then take everything between the sentinel tokens
// (12 is the length of "<BEGIN_JSON>").
const content = '<BEGIN_JSON>{"name":"' + rsp.message.content
const json = content.substring(content.indexOf("<BEGIN_JSON>") + 12, content.lastIndexOf("<END_JSON>"))
const char = JSON.parse(json)

Here is the complete generateCharacter function.

async function generateCharacter() {
  loading.value = true
  const messages = [
    {
      role: 'system', content: `
Provide only a JSON response starting at <BEGIN_JSON> to <END_JSON> with correct syntax strictly adhering to the following structure.

{
  "name":"Jane Doe",
  "gender":"Female",
  "age":"25",
  "race":"",
  "occupation":"student",
  "education":"bachelor's degree",
  "marital_status":"",
  "income_level":"low",
  "background":"[5-8 sentences about the character's background that led them to use the app with a specific purpose or need]",
  "success_story:"Jane was able to use the app successfully to solve her problem.",
  "media":{
    "image_url":"[a sample image url appropriate to the character]"
  }
}
`
    },
    { role: 'user', content: 'Given the following PROJECT, create a CHARACTER as described below.' },
    { role: 'user', content: 'PROJECT: \n' + project.value.abstract + '\n\n' },
    { role: 'user', content: 'CHARACTER: \n' + character.value },
    {
      role: 'assistant', content: `Sure! here is the character in the JSON format you requested.

<BEGIN_JSON>
{
  "name":"`}
  ]
  const rsp = await ollama.chat({ model: props.model, messages: messages, stream: false, options: { temperature: 1.5 } })
  let json = ""
  let content = '<BEGIN_JSON>{"name":"' + rsp.message.content
  try {
    json = content.substring(content.indexOf("<BEGIN_JSON>") + 12, content.lastIndexOf("<END_JSON>"))
    const char = JSON.parse(json)
    char.stakeholder = characterStakeholder.value
    project.value.characters.push(char)
    charactersError.value = false
    dlgAICharacters.value = false
  } catch (error) {
    charactersError.value = true
    project.value.timeline = []
  }
  loading.value = false
}

Prompt to generate a character for SBD

The resulting character with complete backstory

As you can see, the tool produces a rich set of requirements which the client can then modify to suit their needs. We have found this to be an effective (and fun) way to gather requirements from even the most reluctant or indecisive client. With a little assistance during the interview process, we have significantly streamlined requirements gathering and produced much more reliable results. Hopefully, these examples help you create similar solutions that leverage the rapidly evolving and incredibly powerful AI technologies that are becoming essential tools for modern software development.
