Quality Filtering: Removing Sycophantic Responses
Filter out low-quality, overly agreeable responses from a chat log dataset using LLM-generated quality scores.
Overview
Starting with a 200k-row chat log dataset (ultrachat_200k), generate a sycophancy score for each conversation, then filter out highly sycophantic responses.

Steps
- Load the dataset
- Generate sycophancy scores
Open the chat panel on the right-hand side
Use chat to request: "add a 0-1 sycophancy score for each row"
> Note: Hyperparam analyzes each conversation and creates a sycophancy_score column (0.0 = authentic, 1.0 = highly sycophantic).
- View sorted sample
Select either the first rows or a random sample to view 100 rows of the dataset; these rows can then be sorted by the generated column
Click the sycophancy_score column header to sort the table in ascending or descending order
- Apply filter
Add filter: sycophancy_score < 0.2
> Note: Keeps only responses with low sycophancy (authentic, non-pandering)
- Export filtered dataset
Click export
Enable "Apply current table filters"
Set output filename (e.g., ultrachat_200k_filtered.parquet)
> Note: The export processes the full dataset with the filter applied
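The preview and filter steps above can be sketched outside the UI as follows. This is a minimal illustration under stated assumptions, not Hyperparam's implementation: rows are plain dictionaries that already carry a sycophancy_score value, and the helper names `preview` and `apply_filter` are hypothetical.

```python
import random

THRESHOLD = 0.2  # cutoff from the filter step: keep rows with score < 0.2


def preview(rows, n=100, seed=0, descending=True):
    """Draw a random sample of up to n rows and sort it by sycophancy_score."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(n, len(rows)))
    return sorted(sample, key=lambda r: r["sycophancy_score"], reverse=descending)


def apply_filter(rows, threshold=THRESHOLD):
    """Keep rows whose score is strictly below the threshold."""
    return [r for r in rows if r["sycophancy_score"] < threshold]


# Hypothetical rows standing in for scored conversations.
rows = [
    {"id": 0, "sycophancy_score": 0.05},
    {"id": 1, "sycophancy_score": 0.20},
    {"id": 2, "sycophancy_score": 0.85},
    {"id": 3, "sycophancy_score": 0.10},
]

top = preview(rows, n=3)   # sorted sample, most sycophantic first
kept = apply_filter(rows)  # filter applied to all rows, not just the sample
print([r["id"] for r in kept])  # [0, 3]
```

Because the comparison is strict, a row scoring exactly 0.2 is dropped. The preview sorts only the sampled rows, while the filter runs over every row, which mirrors the export behavior described in the note above.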
Expected Results
- Generated column: sycophancy_score, rating how sycophantic each response is (0.0 = authentic, 1.0 = highly sycophantic)
- Filtered dataset: Only rows with sycophancy_score < 0.2, removing overly agreeable responses
- Output: Cleaned parquet file ready for training or further analysis
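One way to sanity-check the exported file against these expectations is sketched below. It assumes the exported rows have been loaded into a list of dictionaries; the function name `summarize_export` and the sample rows are hypothetical.

```python
def summarize_export(rows, threshold=0.2, column="sycophancy_score"):
    """Return basic checks for an exported, filtered set of rows."""
    scores = [r[column] for r in rows]
    return {
        "rows": len(scores),
        "max_score": max(scores) if scores else None,
        # Scores are expected to lie in [0.0, 1.0].
        "out_of_range": sum(1 for s in scores if not 0.0 <= s <= 1.0),
        # Rows that should have been removed by the "< threshold" filter.
        "violations": sum(1 for s in scores if s >= threshold),
    }


# Hypothetical rows standing in for the exported file's contents.
exported = [
    {"sycophancy_score": 0.05},
    {"sycophancy_score": 0.10},
    {"sycophancy_score": 0.19},
]

report = summarize_export(exported)
assert report["violations"] == 0, "export contains rows at or above the threshold"
print(report)
```

If pandas and a Parquet engine are installed, the rows could come from `pandas.read_parquet("ultrachat_200k_filtered.parquet").to_dict("records")`.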
Other Use Cases
- Dataset Discovery - Use natural language to search and discover datasets
- Data Transformation - Categorize and derive insights from unstructured text
- Patient Data Workflow - Extract, filter, and export structured medical data
- Deep Research - Multi-step AI workflow for dataset research and model comparison