Chatting¶

Chatting is the activity of sending prompts and receiving responses from the LLM. A chat is a collection of chat prompts. Once you have set the configuration for your LLM, send it prompts (from the entry box in the lower part of the panel) to determine whether further refinements are needed before considering your LLM blueprint for deployment.

Chatting within the playground is a "conversation": you can ask follow-up questions in subsequent prompts. The following is an example of asking the LLM to output Python code for running DataRobot Autopilot.

The results of the follow-up questions depend on whether context awareness is enabled (see the continuation of the example). Use the playground to test and tune prompts until you are satisfied with the system prompt and settings. Then, click Save configuration at the bottom of the right-hand panel.

Context-aware chatting¶

When configuring an LLM blueprint, you set the history awareness in the Prompting tab.

There are two states of context. They control whether chat history is sent with the prompt to include relevant context for responses.

State	Description
Context-aware	When sending input, previous chat history is included with the prompt. This state is the default.
No context	Sends each prompt as independent input, without history from the chat.

You can switch between one-time (no context) and context-aware within a chat. They each become independent sets of history context—going from context-aware, to no context, and back to aware clears the earlier history from the prompt. (This only happens once a new prompt is submitted.)
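The switching behavior described above can be modeled with a small sketch. This is an illustrative toy, not DataRobot's implementation: the `ChatSketch` class, its fields, and the `llm` callable are hypothetical names introduced here. It assumes that a change of state starts a fresh, independent set of history the next time a prompt is submitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ChatSketch:
    """Toy model of context states within one chat (hypothetical, for illustration)."""

    context_aware: bool = True  # context-aware is the default state
    _history: List[Tuple[str, str]] = field(default_factory=list)
    _last_sent_state: Optional[bool] = None

    def send(self, prompt: str, llm: Callable[[List[str]], str]) -> List[str]:
        """Submit a prompt and return the list of messages sent to the LLM."""
        # A state change only takes effect when a new prompt is submitted;
        # at that point the earlier history is dropped from the prompt.
        if self._last_sent_state is not None and self._last_sent_state != self.context_aware:
            self._history.clear()
        self._last_sent_state = self.context_aware

        payload: List[str] = []
        if self.context_aware:
            # Include previous prompts and responses as context.
            for past_prompt, past_response in self._history:
                payload += [past_prompt, past_response]
        payload.append(prompt)

        response = llm(payload)
        if self.context_aware:
            self._history.append((prompt, response))
        return payload
```

In this sketch, two context-aware prompts in a row send the first exchange along with the second prompt. Setting `context_aware = False`, sending a prompt, and setting it back to `True` means the next prompt is sent with none of the earlier history.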

Context state is reported in two ways:

A badge, which displays to the right of the LLM blueprint name in both configuration and comparison views, reports the current context state:
In the configuration view, dividers show the state of the context setting:

Using the example above, you could then prompt to make a change to "that code." With context-aware enabled, the LLM responds knowing the code being referenced because it is "aware" of the previous conversation history:

See also the section on few-shot prompting.

Single vs comparison chats¶

Chatting with a single LLM blueprint is a good way to tune before starting prompt comparisons with other LLM blueprints. Comparison lets you compare responses between LLM blueprints to help decide which to move to production.

Note

You can only do comparison prompting with LLM blueprints that you created. To see the results of prompting another user's LLM blueprint in a shared Use Case, copy the blueprint; you can then chat with the same settings applied. This is intentional behavior because prompting an LLM blueprint impacts the chat history, which can impact the responses that are generated. However, you can provide response feedback to assist development.

Single LLM blueprint chat¶

When you first configure an LLM blueprint, part of the creation process includes chatting. Set the configuration, and save, to activate chatting:

After seeing chat results, tune the configuration, if desired, and prompt again. Use the additional actions available within each chat result to retrieve more information about the prompt and its response:

Option	Description
View configuration	Shows the configuration used by that prompt in the Configuration panel on the right. If you haven't changed configurations while chatting, no change is apparent. Using this tool allows you to recall previous settings and restore the LLM blueprint to those settings.
Open tracing	Opens the tracing log, which shows all components and prompting activity used in generating LLM responses.
Delete prompt and response	Removes both the prompt and response from the chat history. If deleted, they are no longer considered as context for future responses.

As you send prompts to the LLM, DataRobot maintains a record of those chats. You can either add to the context of an existing chat or start a new chat, which does not carry over any of the context from other chats in the history:

Starting a new chat allows you to have multiple independent conversation threads with a single blueprint. In this way, you can evaluate the LLM blueprint based on different types of topics, without bringing in the history of the previous prompt response, which could "pollute" the answers. While you could also do this by switching context off, submitting a prompt, and then switching it back on, starting a new chat is a simpler solution.

Click Start new chat to begin with a clean history; DataRobot will rename the chat from New chat to the words from your prompt once the prompt is submitted.
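The relationship between chats, their histories, and deleted prompt/response pairs can be pictured with the following sketch. It is a simplified model under stated assumptions, not DataRobot's API: `ChatThreads` and its methods are hypothetical names, and the sketch assumes a new chat is renamed to the full text of its first prompt.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]  # (prompt, response)


class ChatThreads:
    """Toy model of independent chats for a single LLM blueprint (hypothetical)."""

    def __init__(self) -> None:
        self.chats: Dict[str, List[Turn]] = {}

    def start_new_chat(self) -> str:
        # A new chat starts with a clean history and a placeholder name.
        self.chats["New chat"] = []
        return "New chat"

    def submit(self, chat_name: str, prompt: str, response: str) -> str:
        """Record a turn; rename a new chat after its first prompt."""
        turns = self.chats.pop(chat_name)
        if chat_name == "New chat" and not turns:
            chat_name = prompt  # assumption: the name is taken from the prompt text
        turns.append((prompt, response))
        self.chats[chat_name] = turns
        return chat_name

    def delete_turn(self, chat_name: str, index: int) -> None:
        # A deleted prompt/response pair no longer counts as context.
        del self.chats[chat_name][index]

    def context_for(self, chat_name: str) -> List[Turn]:
        # Only turns from this chat are context; other chats never leak in.
        return list(self.chats[chat_name])
```

Each chat keeps its own list of turns, so evaluating the blueprint on a new topic in a new chat cannot pick up history from another chat.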

Comparison LLM blueprint chat¶

Once you are satisfied, click Comparison in the breadcrumbs to compare responses with other LLM blueprints.

If you determine that further tuning is needed after having started a comparison, you can still modify the configuration of individual LLM blueprints:

To compare LLM blueprint chats side-by-side, see the LLM blueprint comparison documentation.

Response feedback¶

Use the response feedback "thumbs" to rate the prompt answer. Ratings are recorded in the User feedback column of the Tracing tab. The response, as part of the exported feedback sent to the AI Catalog, can be used, for example, to train a predictive model.

Citations¶

Citations are a metric that is on by default (as are Latency, Prompt Tokens, and Response Tokens). Citations provide a list of the top reference document chunks, based on relevance to the prompt, retrieved from the vector database (VDB). Be aware that the embedding model used to create the VDB in the first place can affect the quality of the citations retrieved.

Note

Citations only appear when the LLM blueprint being queried has an associated VDB. While citations are one of the available metrics, you do not need the assessment functionality enabled to have citations returned.

Use citations as a safety check to validate LLM responses. While they help to validate LLM responses, citations also allow you to validate proper and appropriate retrieval from the VDB—are you retrieving the chunks from your docs that you want to provide as context to the LLM? Additionally, if you enable the Faithfulness metric, which measures whether the LLM response matches the source, it relies on the citation output for its relevance.
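To make "top chunks, based on relevance to the prompt" concrete, here is a minimal retrieval sketch. It is not how DataRobot's VDB works internally: the `embed` function is a stand-in that uses plain word counts where a real VDB would use a learned embedding model, and `top_citations` is a hypothetical helper name.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real VDB uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_citations(prompt: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the prompt, highest score first."""
    query = embed(prompt)
    scored = [(cosine(query, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Swapping in a different `embed` function changes which chunks rank highest, which is why the choice of embedding model affects citation quality.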

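Confidence scores¶

Confidence scores are computed using a factual consistency metric approach, in which a similarity score is computed between the facts retrieved from the vector database and the text generated by the LLM blueprint. The similarity metric used is ROUGE-1. DataRobot GenAI uses an improved version of ROUGE-1 based on insights from "The limits of automatic summarization according to ROUGE".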
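For reference, standard ROUGE-1 measures unigram (single-word) overlap between a reference text and a candidate text. The sketch below computes the textbook version only; it is not DataRobot's improved variant, whose details are not given here. Treating the retrieved facts as the reference and the generated text as the candidate is an assumption made for illustration, as are the function names.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge_1(reference: str, candidate: str) -> Dict[str, float]:
    """Standard ROUGE-1 precision, recall, and F1 from clipped unigram overlap."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

A generated response that reuses most of the words in the retrieved chunks scores close to 1, while a response sharing no words with them scores 0.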

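Going further: Few-shot prompting¶

Few-shot prompting is a technique for generating or classifying text based on a limited number of examples or prompts ("in-context learning"). The examples, or "shots," condition the model to follow the patterns in the provided context. The model can then generate coherent, contextually relevant text even if it never saw similar examples during training. This is in contrast to traditional machine learning, where a model typically requires a large amount of labeled training data. Few-shot prompting makes the model a good candidate for tasks such as text generation, text summarization, translation, question answering, and sentiment analysis, without requiring fine-tuning on a specific dataset.

A simple example of few-shot prompting is classifying customer feedback as positive or negative. After you show the model three examples of positive and negative feedback, it can assign a rating to unclassified feedback based on those three examples. Few-shot prompting means showing the model two or more examples; zero-shot and one-shot prompting are similar techniques.

The following shows the use of few-shot prompting in DataRobot. The System prompt field contains the prompt and the learning examples:

From the text of a customer support ticket, identify the name of the product it refers to and the type of issue. The issue type is either "hardware" or "software". Format the response as JSON with two keys, "product" and "issue_type".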

---------------
Examples:

Input: I encountered a bug in TPS Report Generator Enterprise Edition. The application crashes every time I click "Generate". Is there an update or fix available?
Output: {"product": "TPS Report Generator Enterprise Edition", "issue_type": "software"}

Input: The screen on my Acme Phone 5+ is flickering and I can't use it. What should I do? I wanted to install a few games and did a factory reset hoping it would fix the problem, but it didn't.
Output: {"product": "Acme Phone 5+", "issue_type": "hardware"}

After providing that context to the LLM, try some example prompts:

Prompt: My PrintPro 9000 is showing a strange error message on its screen. It says "PC LOAD LETTER". What does that mean?

Prompt: I can't install firmware v12.1 on my Print Pro 9002. It says "Incompatible product version".

Prompt: My PrintPro 9001 is making strange noises and is not working properly. Please help with this.
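If you want to assemble the same kind of few-shot system prompt programmatically and check that responses follow the requested format, a sketch might look like the following. The helper names (`build_system_prompt`, `parse_ticket_response`) are hypothetical and not part of any DataRobot API. The ticket texts are abbreviated from the examples above, the key name `issue_type` follows the example outputs, and the sample response in the usage note is made up for illustration.

```python
import json
from typing import Dict, List, Tuple

INSTRUCTION = (
    "From the text of a customer support ticket, identify the name of the "
    "product it refers to and the type of issue. The issue type is either "
    '"hardware" or "software". Format the response as JSON with two keys, '
    '"product" and "issue_type".'
)

# Each shot pairs an input ticket with the expected JSON output.
EXAMPLES: List[Tuple[str, Dict[str, str]]] = [
    (
        "I encountered a bug in TPS Report Generator Enterprise Edition. "
        'The application crashes every time I click "Generate".',
        {"product": "TPS Report Generator Enterprise Edition", "issue_type": "software"},
    ),
    (
        "The screen on my Acme Phone 5+ is flickering and I can't use it.",
        {"product": "Acme Phone 5+", "issue_type": "hardware"},
    ),
]


def build_system_prompt(instruction: str, examples: List[Tuple[str, Dict[str, str]]]) -> str:
    """Join the instruction and the shots into a single system prompt string."""
    lines = [instruction, "", "---------------", "Examples:", ""]
    for ticket, output in examples:
        lines.append(f"Input: {ticket}")
        lines.append(f"Output: {json.dumps(output)}")
        lines.append("")
    return "\n".join(lines).rstrip()


def parse_ticket_response(raw: str) -> Dict[str, str]:
    """Parse an LLM response and check it matches the format the prompt asks for."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"product", "issue_type"}:
        raise ValueError("expected a JSON object with keys 'product' and 'issue_type'")
    if data["issue_type"] not in {"hardware", "software"}:
        raise ValueError("issue_type must be 'hardware' or 'software'")
    return data
```

For example, `parse_ticket_response('{"product": "PrintPro 9000", "issue_type": "hardware"}')` returns the parsed dictionary, while a response with a missing key or an unexpected issue type raises `ValueError`, which makes format drift easy to spot while tuning the prompt.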

Save the draft as an LLM blueprint, register it, and move it to production.

For more information, see the MIT Prompt Engineering Guide.

Updated May 14, 2024
