7. Self-consistency with Pydantic Objects using Sherpa
Self-consistency is a technique that can improve the quality of LLM outputs by considering multiple generated outputs rather than a single one. The original self-consistency paper focuses on simple outputs such as multiple-choice answers and numbers. Sherpa extends this idea to arbitrary Pydantic objects.
7.1. Running Self-Consistency
Let’s say we have the following Pydantic schema:
from pydantic import BaseModel, Field


class Person(BaseModel):
    name: str = Field(..., description="The name of the person")
    age: int = Field(..., description="The age of the person")
    city: str = Field(..., description="The city where the person lives")
Suppose an LLM has generated multiple objects for the task you want to accomplish:
objects = [
    Person(name="Alice", age=30, city="New York"),
    Person(name="Alice", age=31, city="New York"),
    Person(name="Alice", age=30, city="Los Angeles"),
]
The current version identifies the most common value for each field independently and returns a new object built from those values:
from sherpa_ai.output_parsers.self_consistency import run_self_consistency
# Run self-consistency
result = run_self_consistency(objects, schema=Person)
print(result)
# Output: Person(name='Alice', age=30, city='New York')
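To make the per-field voting concrete, here is a minimal standalone sketch of the idea. It is not Sherpa's implementation: the helper name `majority_vote_fields` is hypothetical, it works on plain dictionaries so that it has no dependencies, and it assumes every field value is hashable. With Pydantic v2 models, the input could be built with `[o.model_dump() for o in objects]`.

```python
from collections import Counter
from typing import Any


def majority_vote_fields(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Return a dict holding the most common value of each field.

    Hypothetical helper for illustration only; not part of Sherpa.
    """
    if not records:
        raise ValueError("need at least one record")
    merged = {}
    for field in records[0]:
        # Count how often each value occurs for this field across all records.
        counts = Counter(record[field] for record in records)
        # most_common(1) returns [(value, count)]; on ties, the value
        # encountered first wins.
        merged[field] = counts.most_common(1)[0][0]
    return merged


records = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Alice", "age": 31, "city": "New York"},
    {"name": "Alice", "age": 30, "city": "Los Angeles"},
]
merged = majority_vote_fields(records)
print(merged)
# {'name': 'Alice', 'age': 30, 'city': 'New York'}
```

Because each field is voted on separately, the merged result can combine values that never appeared together in any single candidate.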
7.2. Advanced Configuration
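You can also configure how list attributes are aggregated using the config parameter. The field names tags and scores below are illustrative list attributes; the Person schema above does not define them: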
from sherpa_ai.output_parsers.self_consistency.config import SelfConsistencyConfig, ListConfig
config = SelfConsistencyConfig(
    list_config={
        "tags": ListConfig(strategy="top_k", top_k=2),
        "scores": ListConfig(strategy="threshold", threshold=3.0),
    }
)
result = run_self_consistency(objects, schema=Person, config=config)
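The exact semantics of the top_k and threshold strategies are not spelled out here, so the sketch below shows only one plausible reading, written without Sherpa: "top_k" keeps the k elements that appear in the most candidate lists, and "threshold" keeps elements that appear in at least that many lists. The function names and this interpretation are assumptions; Sherpa's actual behavior may differ.

```python
from collections import Counter
from typing import Hashable


def count_occurrences(lists: list[list[Hashable]]) -> Counter:
    """Count in how many candidate lists each element appears."""
    # dict.fromkeys de-duplicates within a list while preserving order,
    # so an element repeated inside one list is counted only once.
    return Counter(item for lst in lists for item in dict.fromkeys(lst))


def merge_top_k(lists: list[list[Hashable]], k: int) -> list[Hashable]:
    """Assumed 'top_k' reading: keep the k most frequent elements."""
    return [item for item, _ in count_occurrences(lists).most_common(k)]


def merge_threshold(lists: list[list[Hashable]], threshold: float) -> list[Hashable]:
    """Assumed 'threshold' reading: keep elements seen in >= threshold lists."""
    return [item for item, n in count_occurrences(lists).items() if n >= threshold]


candidate_tags = [
    ["python", "llm", "nlp"],
    ["python", "llm"],
    ["python", "agents"],
]
top_two = merge_top_k(candidate_tags, k=2)
frequent = merge_threshold(candidate_tags, threshold=3.0)
print(top_two)   # ['python', 'llm']
print(frequent)  # ['python']
```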
For more details on the self-consistency process, you can refer to the self-consistency module documentation.
Note
A future version will support relational constraints between fields, so that self-consistency can be applied to more complex use cases.