Adaptors

bocoel.Adaptor

Bases: Protocol

Adaptors are the glue between scores, language models, and the corpus. An adaptor is designed to run a particular score on a particular corpus or dataset.

evaluate abstractmethod

evaluate(data: Mapping[str, Sequence[Any]]) -> Sequence[float] | NDArray

Evaluate a particular set of entries with a language model. Returns a list of scores, one for each entry, in the same order.

Parameters

data: Mapping[str, Sequence[Any]] A mapping from column names to the data in that column.

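Note: evaluate takes no lm argument. The concrete adaptors on this page (for example BigBenchQuestionAnswer) receive the language model in their constructor and store it as self.lm.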

Returns

The scores for each entry. Scores must be floating point numbers.

Source code in bocoel/models/adaptors/interfaces/adaptors.py
@abc.abstractmethod
def evaluate(self, data: Mapping[str, Sequence[Any]]) -> Sequence[float] | NDArray:
    """
    Evaluate a particular set of entries with a language model.
    Returns a list of scores, one for each entry, in the same order.

    Parameters
    ----------

    `data: Mapping[str, Sequence[Any]]`
    A mapping from column names to the data in that column.

    Note: the language model is not an argument of this method;
    concrete adaptors keep it as an attribute (for example `self.lm`).

    Returns
    -------

    The scores for each entry. Scores must be floating point numbers.
    """

    ...
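To make the evaluate contract concrete, here is a minimal sketch of a class that satisfies it. Everything in it is an assumption made for illustration: `UppercaseModel`, its `generate` method, and `ExactMatchAdaptor` are hypothetical names and are not part of bocoel. The only points taken from the documentation above are that `data` maps column names to columns and that one float is returned per entry, in order.

```python
from typing import Any, Mapping, Sequence


class UppercaseModel:
    """Hypothetical stand-in for a language model: upper-cases each prompt."""

    def generate(self, prompts: Sequence[str]) -> list[str]:
        return [prompt.upper() for prompt in prompts]


class ExactMatchAdaptor:
    """Sketch of an adaptor: the model lives on the instance, not in evaluate()."""

    def __init__(
        self, lm: UppercaseModel, inputs: str = "inputs", targets: str = "targets"
    ) -> None:
        self.lm = lm
        self.inputs = inputs
        self.targets = targets

    def evaluate(self, data: Mapping[str, Sequence[Any]]) -> Sequence[float]:
        prompts = data[self.inputs]
        expected = data[self.targets]
        generated = self.lm.generate(prompts)
        # One float per entry, in the same order as the input columns.
        return [float(out == want) for out, want in zip(generated, expected)]


batch = {"inputs": ["abc", "def"], "targets": ["ABC", "xyz"]}
scores = ExactMatchAdaptor(UppercaseModel()).evaluate(batch)
print(scores)  # [1.0, 0.0]
```

The first entry scores 1.0 because the stub model's output matches the target, and the second scores 0.0 because it does not.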

on_storage

on_storage(storage: Storage, indices: ArrayLike) -> NDArray

Evaluate a particular set of indices on a storage. Given indices and a storage, this method will extract the corresponding entries from the storage, and evaluate them with Adaptor.evaluate.

Parameters

storage: Storage The storage to extract entries from.

Note: on_storage takes no lm argument; evaluation uses whatever model the adaptor instance holds.

indices: ArrayLike The indices to extract from the storage.

Returns

The scores for each entry. Scores must be floating point numbers. The shape of the returned array must be the same as the shape of indices.

Source code in bocoel/models/adaptors/interfaces/adaptors.py
def on_storage(self, storage: Storage, indices: ArrayLike) -> NDArray:
    """
    Evaluate a particular set of indices on a storage.
    Given indices and a storage,
    this method will extract the corresponding entries from the storage,
    and evaluate them with `Adaptor.evaluate`.

    Parameters
    ----------

    `storage: Storage`
    The storage to extract entries from.

    Note: the language model is not an argument of this method;
    evaluation uses the model held by the adaptor instance.

    `indices: ArrayLike`
    The indices to extract from the storage.

    Returns
    -------

    The scores for each entry. Scores must be floating point numbers.
    The shape of the returned array must be the same as the shape of `indices`.
    """

    indices = np.array(indices).astype("i")

    # Reshape the indices into 1D to evaluate.
    indices_shape = indices.shape
    indices = indices.ravel()

    items = storage[indices.tolist()]
    result = np.array(self.evaluate(data=items))

    # Reshape back.
    return result.reshape(indices_shape)
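The flatten, evaluate, reshape pattern in the listing above can be tried in isolation. The sketch below assumes a hypothetical `DictStorage` whose `__getitem__` accepts a list of integers and returns a column mapping, and a toy `LengthAdaptor`; neither is a bocoel class. Only the body of `on_storage` follows the listing.

```python
import numpy as np


class DictStorage:
    """Hypothetical column store: indexing with a list of ints returns columns."""

    def __init__(self, columns):
        self.columns = columns

    def __getitem__(self, idx):
        return {name: [values[i] for i in idx] for name, values in self.columns.items()}


class LengthAdaptor:
    """Toy adaptor whose score is the length of each text entry."""

    def evaluate(self, data):
        return [float(len(text)) for text in data["text"]]

    def on_storage(self, storage, indices):
        # Same steps as the listing above: cast, flatten, look up, score, reshape.
        indices = np.array(indices).astype("i")
        shape = indices.shape
        items = storage[indices.ravel().tolist()]
        return np.array(self.evaluate(data=items)).reshape(shape)


storage = DictStorage({"text": ["a", "bb", "ccc", "dddd"]})
scores = LengthAdaptor().on_storage(storage, [[0, 1], [3, 2]])
print(scores.shape)  # (2, 2)
```

Because the indices are flattened before lookup and the result is reshaped afterwards, evaluate only ever sees a flat batch, while the caller gets back an array shaped like the indices it passed in.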

on_corpus

on_corpus(corpus: Corpus, indices: ArrayLike) -> NDArray

Evaluate a particular set of indices on a corpus. A convenience wrapper around Adaptor.on_storage.

Source code in bocoel/models/adaptors/interfaces/adaptors.py
def on_corpus(self, corpus: Corpus, indices: ArrayLike) -> NDArray:
    """
    Evaluate a particular set of indices on a corpus.
    A convenience wrapper around `Adaptor.on_storage`.
    """

    return self.on_storage(storage=corpus.storage, indices=indices)
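The delegation performed by on_corpus can be checked with a small stand-in. `ToyCorpus` and `RecordingAdaptor` below are hypothetical; the sketch assumes only what the wrapper above shows, namely that a corpus exposes a `storage` attribute that is passed straight to on_storage.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyCorpus:
    """Hypothetical corpus: only the `storage` attribute matters here."""

    storage: Any


class RecordingAdaptor:
    """Toy adaptor that records which storage it was asked to evaluate."""

    def __init__(self) -> None:
        self.seen = []

    def on_storage(self, storage, indices):
        self.seen.append(storage)
        return [0.0 for _ in indices]

    def on_corpus(self, corpus, indices):
        # Mirrors the one-line wrapper above: unwrap the storage and delegate.
        return self.on_storage(storage=corpus.storage, indices=indices)


storage = {"text": ["a", "b"]}
adaptor = RecordingAdaptor()
result = adaptor.on_corpus(ToyCorpus(storage=storage), [0, 1])
```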

bocoel.BigBenchAdaptor

Bases: Adaptor, Protocol

BigBenchAdaptor documents no members of its own here: evaluate (abstract), on_storage, and on_corpus are inherited unchanged from bocoel.Adaptor. See the descriptions and source listings under bocoel.Adaptor above.

bocoel.BigBenchQuestionAnswer

BigBenchQuestionAnswer(
    lm: GenerativeModel,
    inputs: str = "inputs",
    targets: str = "targets",
    matching_type: str | BigBenchMatchType = BigBenchMatchType.EXACT,
)

Bases: BigBenchAdaptor

Source code in bocoel/models/adaptors/bigbench/matching.py
def __init__(
    self,
    lm: GenerativeModel,
    inputs: str = "inputs",
    targets: str = "targets",
    matching_type: str | BigBenchMatchType = BigBenchMatchType.EXACT,
) -> None:
    self.lm = lm

    self.inputs = inputs
    self.targets = targets

    self._score_fn = BigBenchMatchType.lookup(matching_type).score

on_storage and on_corpus are inherited unchanged from bocoel.Adaptor. See the descriptions and source listings under bocoel.Adaptor above.

bocoel.BigBenchMatchType

Bases: StrEnum
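The constructor of BigBenchQuestionAnswer calls `BigBenchMatchType.lookup(matching_type).score`, which suggests a string enum that can be resolved from either a member or a string and that exposes a scoring function. The sketch below shows one way such an enum could look. It is not the library's implementation: the `CONTAINS` member, the case-insensitive lookup, and both scoring rules are assumptions, and it uses `str, Enum` rather than `StrEnum` so that it also runs on Python versions before 3.11.

```python
from enum import Enum
from typing import Callable, Sequence, Union


class MatchType(str, Enum):
    """Hypothetical string enum; member names and scoring rules are illustrative."""

    EXACT = "EXACT"
    CONTAINS = "CONTAINS"

    @classmethod
    def lookup(cls, name: Union[str, "MatchType"]) -> "MatchType":
        # Accept either a member or its string value (assumed case-insensitive).
        if isinstance(name, cls):
            return name
        return cls(name.upper())

    @property
    def score(self) -> Callable[[str, Sequence[str]], float]:
        if self is MatchType.EXACT:
            # 1.0 when the answer equals one of the targets.
            return lambda answer, targets: float(answer in targets)
        # 1.0 when any target occurs as a substring of the answer.
        return lambda answer, targets: float(any(t in answer for t in targets))


score_fn = MatchType.lookup("exact").score
exact_hit = score_fn("Paris", ["Paris", "paris"])
exact_miss = score_fn("Rome", ["Paris"])
contains_hit = MatchType.lookup(MatchType.CONTAINS).score("in Paris", ["Paris"])
```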

bocoel.BigBenchMultipleChoice

BigBenchMultipleChoice(
    lm: ClassifierModel,
    inputs: str = "inputs",
    multiple_choice_targets: str = "multiple_choice_targets",
    multiple_choice_scores: str = "multiple_choice_scores",
    choice_type: str | BigBenchChoiceType = BigBenchChoiceType.SUM_OF_SCORES,
)

Bases: BigBenchAdaptor

Source code in bocoel/models/adaptors/bigbench/multi.py
def __init__(
    self,
    lm: ClassifierModel,
    inputs: str = "inputs",
    multiple_choice_targets: str = "multiple_choice_targets",
    multiple_choice_scores: str = "multiple_choice_scores",
    choice_type: str | BigBenchChoiceType = BigBenchChoiceType.SUM_OF_SCORES,
) -> None:
    self.lm = lm

    self.inputs = inputs
    self.multiple_choice_targets = multiple_choice_targets
    self.multiple_choice_scores = multiple_choice_scores

    self._score_fn = BigBenchChoiceType.lookup(choice_type).score

on_storage and on_corpus are inherited unchanged from bocoel.Adaptor. See the descriptions and source listings under bocoel.Adaptor above.

numeric_choices staticmethod

numeric_choices(question: str, choices: Sequence[str]) -> str

Convert a multiple choice question into a numeric choice question. Returns the generated prompt as a single string, with the choices numbered from 1.

Source code in bocoel/models/adaptors/bigbench/multi.py
@staticmethod
def numeric_choices(question: str, choices: Sequence[str]) -> str:
    """
    Convert a multiple choice question into a numeric choice question.
    Returns the generated prompt as a single string, with choices numbered from 1.
    """

    return (
        f"{question}\nSelect from one of the following (answer in number):\n"
        + "\n".join(f"{i}) {choice}" for i, choice in enumerate(choices, 1))
    )
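As a quick check of the prompt format, the function body above can be lifted out of the class and run on its own. The question and choices are made-up sample inputs.

```python
from typing import Sequence


def numeric_choices(question: str, choices: Sequence[str]) -> str:
    # Same body as the static method above, lifted out of the class.
    return (
        f"{question}\nSelect from one of the following (answer in number):\n"
        + "\n".join(f"{i}) {choice}" for i, choice in enumerate(choices, 1))
    )


prompt = numeric_choices("Which is a fruit?", ["rock", "apple"])
print(prompt)
# Which is a fruit?
# Select from one of the following (answer in number):
# 1) rock
# 2) apple
```

The result is a single string, and numbering starts at 1 because of `enumerate(choices, 1)`.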

bocoel.BigBenchChoiceType

Bases: StrEnum

bocoel.Sst2QuestionAnswer

Sst2QuestionAnswer(
    lm: ClassifierModel,
    sentence: str = "sentence",
    label: str = "label",
    choices: Sequence[str] = ("negative", "positive"),
)

Bases: Adaptor

The adaptor for the SST-2 dataset. This adaptor assumes that the dataset has the following columns:

- idx: The index of the entry.
- sentence: The sentence to classify.
- label: The label of the sentence.

Each entry in the dataset must be a single sentence.

Source code in bocoel/models/adaptors/glue/sst.py
def __init__(
    self,
    lm: ClassifierModel,
    sentence: str = "sentence",
    label: str = "label",
    choices: Sequence[str] = ("negative", "positive"),
) -> None:
    self.lm = lm

    self.sentence = sentence
    self.label = label
    self.choices = choices
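The page does not show how Sst2QuestionAnswer scores an entry, so the following is only a sketch of one plausible scheme: classify each sentence over the choices and give 1.0 when the highest-scoring choice matches the label. `KeywordClassifier`, its `classify` method, and `Sst2Sketch` are hypothetical. The sketch relies on the SST-2 convention that label 0 is negative and label 1 is positive, which lines up with the default `choices` order.

```python
from typing import Any, Mapping, Sequence


class KeywordClassifier:
    """Hypothetical classifier: one [negative, positive] score row per sentence."""

    def classify(self, sentences: Sequence[str]) -> list[list[float]]:
        rows = []
        for sentence in sentences:
            positive = float("great" in sentence.lower())
            rows.append([1.0 - positive, positive])
        return rows


class Sst2Sketch:
    """Illustrative adaptor with the same constructor arguments as above."""

    def __init__(self, lm, sentence="sentence", label="label",
                 choices=("negative", "positive")) -> None:
        self.lm = lm
        self.sentence = sentence
        self.label = label
        self.choices = choices

    def evaluate(self, data: Mapping[str, Sequence[Any]]) -> Sequence[float]:
        rows = self.lm.classify(data[self.sentence])
        # Index of the highest-scoring choice for each sentence.
        predicted = [max(range(len(self.choices)), key=row.__getitem__) for row in rows]
        return [float(p == int(l)) for p, l in zip(predicted, data[self.label])]


batch = {"sentence": ["A great film", "Dull and slow"], "label": [1, 1]}
scores = Sst2Sketch(KeywordClassifier()).evaluate(batch)
print(scores)  # [1.0, 0.0]
```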

on_storage and on_corpus are inherited unchanged from bocoel.Adaptor. See the descriptions and source listings under bocoel.Adaptor above.