Managing Data in Klu

Once you run LLMs in production, you need an efficient way to manage and use the data they generate. Klu lets you import existing data via the UI or SDK, manage and filter feedback, and export data whenever necessary.


Importing Existing Data

If you have existing prompt and completion data, you can import all of it into Klu.

Importing via the UI

Navigate to the Data section for your App and select Import Data from the header menu. We recommend using JSONL as your data format, and provide templates for easy import.
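
As a rough illustration of the JSONL format, the sketch below serializes records as one JSON object per line. The field names `input`, `output`, and `action` are assumptions that mirror the SDK arguments shown on this page; the templates offered in the Import Data dialog define the actual schema.

```python
import json

# Hypothetical records; check Klu's import templates for the real field names.
records = [
    {"input": "What is the capital of France?", "output": "Paris.", "action": "action_guid"},
    {"input": "Translate 'hello' to Spanish.", "output": "Hola.", "action": "action_guid"},
]


def to_jsonl(rows):
    """Serialize each row as one JSON object per line (the JSONL convention)."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


jsonl_text = to_jsonl(records)
print(jsonl_text)
```

To produce a file for upload, write `jsonl_text` to disk with UTF-8 encoding, for example with `pathlib.Path("klu_import.jsonl").write_text(jsonl_text, encoding="utf-8")`.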

Importing via the SDK

With the Klu SDK, you can import all your data programmatically. The process is:

  • Set up your Workspace and get your API key
  • Create a new App and Action
  • Set the Action GUID at import time

Backfill Data

from klu import Klu

klu = Klu("YOUR_API_KEY")

prompt = "What you sent to the LLM"
completion = "What the LLM responded with"
action = "action_guid"  # GUID of the Action this data point belongs to
has_feedback = True  # set to False if there is no feedback to backfill

# Create the data point for the prompt/completion pair
data = klu.data.create(input=prompt, output=completion, action=action)

# Optionally attach feedback to the data point that was just created
if has_feedback:
    feedback = klu.feedback.create(
        data_guid=data.guid,
        type="rating",  # or action, issue, or correction
        value="1",  # ratings: "1" is negative, "2" is positive; issue, action, and correction accept any value
        created_by="user_id",  # could be a Klu user or an external user
        source="backfill",  # or a specific source; api by default
    )

print(data)

Repeat this for each record to bring all of your existing data into Klu.
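
For more than a handful of records, the same call can be wrapped in a loop. The sketch below is a minimal illustration, not part of the Klu SDK: `backfill` is a hypothetical helper that takes any callable with the same keyword arguments as `klu.data.create` in the snippet above, and it assumes that call is synchronous as shown there. A stand-in function is used here so the example runs without an API key.

```python
def backfill(records, create_data):
    """Call create_data once per record; collect failures instead of aborting."""
    created, failed = [], []
    for index, record in enumerate(records):
        try:
            created.append(
                create_data(
                    input=record["input"],
                    output=record["output"],
                    action=record["action"],
                )
            )
        except Exception as exc:  # keep going so one bad record does not stop the run
            failed.append((index, repr(exc)))
    return created, failed


def fake_create(input, output, action):
    """Stand-in for klu.data.create, used only for this illustration."""
    return {"input": input, "output": output, "action": action}


records = [
    {"input": "Hi", "output": "Hello!", "action": "action_guid"},
    {"input": "Missing output field", "action": "action_guid"},
]

created, failed = backfill(records, fake_create)
print(len(created), "created;", len(failed), "failed")
```

With a real client, the call would be `backfill(records, klu.data.create)`. Collecting failures by index makes it straightforward to fix and retry only the records that did not import.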


Browse and Filter Data

Navigate to the Data section via the sidebar navigation for your App.

Here, you'll find all the generated data points for that application. You can easily filter these to a specific data set.

Currently Klu supports the following filters:

  • Date Range: Choose the time frame for the generated data.
  • Action: Specify the action that triggered the output.
  • Input/Output Search: Search for specific words or phrases in inputs or outputs.
  • Rating: Filter by user feedback, either positive or negative.
  • Source: Identify the origin of the output.
  • Insights: Filter by helpfulness, sentiment, or moderation.
  • Klu User: Determine which user or SDK key generated the output.
  • Feedback Issue: Filter by any present issues such as hallucinations, toxicity, etc.

To preserve the data set for future training or evaluation, you can save it as a named Dataset.


Feedback

With Klu, you can provide feedback on any output. This feedback is then used to train the model to generate better outputs. From the app, click thumbs up or thumbs down to flag an output as good or bad.

Within the Data section, you will see the thumbs-up and thumbs-down icons. They are the easiest way to mark an output as good or bad.

If you would like to edit the outputs, flag them as inappropriate, or add additional context, click the icon on the right-hand side of the output.

Within that modal, you have several options for adding context:

  • User Behavior - set a user behavior, including Saved, Copied, Shared, or Deleted.
  • Generation Issue - set a generation issue, including Hallucination, Inappropriate, or Repetition.
  • Response Correction - set a correction based on the completion output for a data point.

This context makes the data more useful for later fine-tuning and optimization.
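
The same kinds of feedback can presumably be recorded through `klu.feedback.create`, which the backfill snippet on this page shows accepting a `type` of rating, action, issue, or correction. The sketch below only builds the keyword arguments for that call. `feedback_kwargs` is a hypothetical helper, and the mapping of User Behavior to `action`, Generation Issue to `issue`, and Response Correction to `correction`, as well as the exact value strings, are assumptions.

```python
# Rating values as documented in the backfill snippet: "1" negative, "2" positive.
RATING_VALUES = {"negative": "1", "positive": "2"}
FEEDBACK_TYPES = {"rating", "action", "issue", "correction"}


def feedback_kwargs(data_guid, kind, value, created_by, source="api"):
    """Build keyword arguments for a feedback call, validating type and rating."""
    if kind not in FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback type: {kind!r}")
    if kind == "rating":
        if value not in RATING_VALUES:
            raise ValueError("rating must be 'positive' or 'negative'")
        value = RATING_VALUES[value]
    return {
        "data_guid": data_guid,
        "type": kind,
        "value": value,
        "created_by": created_by,
        "source": source,
    }


# Assumed value strings, taken from the option names listed above.
thumbs_up = feedback_kwargs("data_guid", "rating", "positive", "user_id")
behavior = feedback_kwargs("data_guid", "action", "Copied", "user_id")
issue = feedback_kwargs("data_guid", "issue", "Hallucination", "user_id")
correction = feedback_kwargs("data_guid", "correction", "The corrected completion text", "user_id")
print(thumbs_up)
```

With a configured client, each dictionary could then be passed along as `klu.feedback.create(**issue)`.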


Export

Klu ensures your data remains yours, offering the flexibility to export it whenever required.

We currently provide two data export formats:

  • CSV - easily import into a spreadsheet for further analysis
  • JSONL - ready to use in any other system that accepts JSONL files (e.g. fine-tuning or evals).

Data export is also possible through the API or SDK.
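
Once exported, a JSONL file can be processed with standard tooling. The sketch below parses JSONL text and converts it to CSV using only the Python standard library. The field names in the sample (`input`, `output`, `rating`) are assumptions; an actual Klu export may use different columns.

```python
import csv
import io
import json


def jsonl_to_rows(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def rows_to_csv(rows):
    """Render dict rows as CSV; the header is the union of keys in first-seen order."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


# Hypothetical export contents, for illustration only.
sample = (
    '{"input": "Hi", "output": "Hello!", "rating": 2}\n'
    '{"input": "Bye", "output": "Goodbye!"}\n'
)

rows = jsonl_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Building the header from the union of keys keeps the conversion from failing when some records lack optional fields such as a rating.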